SARS VIRUS NUCLEOTIDE AND AMINO ACID
SEQUENCES AND USES THEREOF

Field of the Invention 
The invention is in the field of virology. More specifically, the invention is in 
the field of coronaviruses. 

Background of the Invention 
Severe acute respiratory syndrome (SARS), a worldwide outbreak of atypical
pneumonia with an overall mortality rate of about 3 to 6%, has been attributed to a
coronavirus following tests of causation according to Koch's postulates, including
monkey inoculation (R. Munch, Microbes Infect 5, 69-74, Jan. 2003). The
coronaviruses are members of a family of enveloped viruses that replicate in the
cytoplasm of animal host cells (B. N. Fields et al., Fields virology, Lippincott Williams
& Wilkins, Philadelphia, 4th ed., 2001). They are distinguished by the presence of a
single-stranded plus sense RNA genome, approximately 30 kb in length, that has a 5'
cap structure and 3' polyA tract. Hence the genome is essentially a very large mRNA.
Upon infection of an appropriate host cell, the 5'-most open reading frame (ORF) of
the viral genome is translated into a large polyprotein that is cleaved by viral-encoded
proteases to release several nonstructural proteins including an RNA-dependent RNA
polymerase (Pol) and an ATPase helicase (Hel). These proteins in turn are responsible
for replicating the viral genome as well as generating nested transcripts that are used in
the synthesis of the viral proteins. The mechanism by which these subgenomic mRNAs
are made is not fully understood; however, transcription regulating sequences (TRSs) at
the 5' end of each gene may represent signals that regulate the discontinuous
transcription of subgenomic mRNAs (sgmRNAs). The TRSs include a partially
conserved core sequence (CS) that in some coronaviruses is 5'-CUAAAC-3'. Two
major models have been proposed to explain the discontinuous transcription in
coronaviruses and arteriviruses (M. M. C. Lai, D. Cavanagh, Adv. Virus Res.
48, 1 (1997); S. G. Sawicki, D. L. Sawicki, Adv. Exp. Med. Biol. 440, 215 (1998)). The
discovery of transcriptionally active, subgenomic-size minus strands containing the
antileader sequence and transcription intermediates active in the synthesis of mRNAs
(D. L. Sawicki et al., J. Gen. Virol. 82, 386 (2001); S. G. Sawicki, D. L. Sawicki, J. Virol.
64, 1050 (1990); M. Schaad, R. S. Baric, J. Virol. 68, 8169 (1994); P. B. Sethna et al.,
Proc. Natl. Acad. Sci. U.S.A. 86, 5626 (1989)) favors the model of discontinuous
transcription during minus strand synthesis (S. G. Sawicki, D. L. Sawicki, Adv. Exp.
Med. Biol. 440, 215 (1998)).

The coronaviral membrane proteins, including the major proteins S (Spike) and
M (Membrane), are inserted into the endoplasmic reticulum Golgi intermediate
compartment (ERGIC) while full length replicated RNA (+ strands) assemble with the
N (nucleocapsid) protein. This RNA-protein complex then associates with the M
protein embedded in the membranes of the ER and virus particles form as the
nucleocapsid complex buds into the ER. The virus then migrates through the Golgi
complex and eventually exits the cell, likely by exocytosis (B. N. Fields et al., Fields
virology, Lippincott Williams & Wilkins, Philadelphia, 4th ed., 2001). The site of viral
attachment to the host cell resides within the S protein.

The coronaviruses include a large number of viruses that infect different animal
species. The predominant diseases associated with these viruses are respiratory and
enteric infections, although hepatic and neurological diseases also occur with some
viruses. Coronaviruses are divided into three serotypes, Types I, II and III.
Phylogenetic analysis of coronavirus sequences also identifies three main classes of
these viruses, corresponding to each of the three serotypes. Type II coronaviruses
contain a hemagglutinin esterase (HE) gene homologous to that of Influenza C virus. It
is presumed that the precursor of the Type II coronaviruses acquired HE as a result of a
recombination event within a doubly infected host cell.

In view of the rapid worldwide dissemination of SARS, which has the potential
of creating a pandemic, along with its alarming morbidity and mortality rates, it would
be useful to have a better understanding of this coronavirus agent at the molecular level
to provide diagnostics, vaccines, and therapeutics, and to support public health control
measures.

Summary of the Invention
In general, the invention provides the genomic sequence of a novel coronavirus, 
the SARS virus, and provides novel nucleic acid molecules encoding novel proteins 
that may be used, for example, for the diagnosis or therapy of a variety of SARS
virus-related disorders.

In one aspect, the invention provides a substantially pure SARS virus nucleic
acid molecule or fragment thereof, for example, a genomic RNA or DNA, cDNA,
synthetic DNA, or mRNA molecule. In some embodiments, the nucleic acid molecule
includes a sequence substantially identical to any of the sequences of SEQ ID NOs: 1-13,
15-18, 20-30, 90-159, 208, 209. In some embodiments, the nucleic acid molecule
includes a sequence from SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 15 or a
fragment of these sequences. In alternative embodiments, the nucleic acid molecule
may include a sequence substantially identical to SEQ ID NO: 1, SEQ ID NO: 2, or
SEQ ID NO: 15, or a fragment thereof. In alternative embodiments, the nucleic acid
molecule may include a s2m motif (for example, a s2m sequence substantially identical
to any of the sequences of SEQ ID NOs: 16, 17, and 18), a leader sequence (for
example, a sequence substantially identical to the sequence of SEQ ID NO: 3), or a
transcriptional regulatory sequence (for example, a sequence substantially identical to
any of the sequences of SEQ ID NOs: 4-13 and 20-30). In alternative embodiments, the
nucleic acid molecule includes a sequence substantially identical to any of the
sequences of nucleotides 265-13,398; 13,398-21,485; 21,492-25,259; 25,268-26,092;
25,689-26,153; 26,117-26,347; 26,398-27,063; 27,074-27,265; 27,273-27,641;
27,638-27,772; 27,779-27,898; 27,864-28,118; 28,120-29,388; 28,130-28,426;
28,583-28,795; and 29,590-29,621 of SEQ ID NO: 15. In alternative
embodiments, the nucleic acid molecule may encode a polyprotein or a polypeptide. In
alternative embodiments, the invention provides a nucleic acid molecule including a
sequence complementary to a SARS virus nucleotide sequence.
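
The nucleotide ranges listed above can be converted into sequence slices programmatically. The following minimal Python sketch assumes, as is conventional for sequence listings but not stated in the text, that the coordinates are 1-based and inclusive. The helper names parse_range, region_length, and extract_region are hypothetical, and any sequence passed in merely stands in for SEQ ID NO: 15.

```python
import re

def parse_range(text):
    """Parse a range such as '21,492-25,259' into (start, end) integers."""
    match = re.fullmatch(r"\s*([\d,]+)\s*-\s*([\d,]+)\s*", text)
    if match is None:
        raise ValueError(f"not a nucleotide range: {text!r}")
    start, end = (int(part.replace(",", "")) for part in match.groups())
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates: {text!r}")
    return start, end

def region_length(start, end):
    """Length of a region given 1-based inclusive coordinates (assumed)."""
    return end - start + 1

def extract_region(sequence, start, end):
    """Return the subsequence for 1-based inclusive coordinates (assumed)."""
    return sequence[start - 1:end]

# Example: the first listed range, 265-13,398
start, end = parse_range("265-13,398")
print(region_length(start, end))  # 13134 nucleotides
```

Under the same assumption, extract_region(genome, 21492, 25259) would return the third listed region from a string holding the full genomic sequence.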

In an alternative aspect, the invention provides a substantially pure SARS virus
polypeptide or fragment thereof, for example, a polyprotein, glycoprotein (for example,
a matrix glycoprotein that may include a sequence substantially identical to the
sequence of SEQ ID NO: 34), a transmembrane protein (for example, a
multitransmembrane protein, a type I transmembrane protein, or a type II
transmembrane protein), an RNA binding protein, or a viral envelope protein. In
alternative embodiments, the invention provides a replicase 1a protein, replicase 1b
protein, a spike glycoprotein, a small envelope protein, a matrix glycoprotein, or a
nucleocapsid protein. In alternative embodiments, the invention provides a nucleic acid
molecule encoding a SARS virus polypeptide. In alternative embodiments, the SARS
virus polypeptide includes an identifiable signal sequence (for example, a signal
sequence substantially identical to the sequence of SEQ ID NOs: 76 or 85), a
transmembrane domain (for example, a transmembrane domain substantially identical
to any of the sequences of SEQ ID NOs: 77-86), a transmembrane anchor, a
transmembrane helix, an ATP-binding domain, a nuclear localization signal, a
hydrophilic domain (for example, a hydrophilic domain substantially identical to the
sequence of SEQ ID NO: 87), or a lysine-rich sequence (for example, a sequence
substantially identical to the sequence of SEQ ID NO: 14). In alternative embodiments,
the SARS virus polypeptide may include a sequence substantially identical to any of
the sequences of SEQ ID NOs: 14, 33-36, 64-74, and 76-87.

In alternative embodiments, the invention provides a vector (for example, a 
gene therapy vector or a cloning vector) including a SARS virus nucleic acid molecule 
(for example, a molecule including a sequence substantially identical to any of the 
sequences of SEQ ID NOs: 1-13, 15-18, 20-30, 90-159, 208, 209), or a host cell (for
example, a mammalian cell, a yeast, a bacterium, or a nematode cell) including the
vector. 

In alternative embodiments, the invention provides a nucleic acid molecule 
having substantial nucleotide sequence identity (for example, 30%, 40%, 50%, 60%, 
70%, 80%, 90% or 100% complementarity) to a sequence encoding a SARS virus
polypeptide or fragment thereof, for example where the fragment includes at least six
amino acids, and where the nucleic acid molecule hybridizes under high stringency 
conditions to at least a portion of a SARS virus nucleic acid molecule. 

In alternative embodiments, the invention provides a nucleic acid molecule 
having substantial nucleotide sequence identity (for example, 30%, 40%, 50%, 60%,
70%, 80%, 90% or 100% complementarity) to a SARS virus nucleotide sequence, for
example where the nucleic acid molecule includes at least ten nucleotides, and where
the nucleic acid molecule hybridizes under high stringency conditions to at least a
portion of a SARS virus nucleic acid molecule.

WO 2004/096842 PCT/CA2004/000626

In alternative embodiments, the invention provides a nucleic acid molecule 
comprising a sequence that is antisense to a SARS virus nucleic acid molecule, or an 
antibody (for example, a neutralizing antibody) that specifically binds to a SARS virus
polypeptide.

In alternative embodiments, the invention provides a method for detecting a
SARS epitope, such as a virion or polypeptide in a sample, by contacting the sample
with an antibody that specifically binds a SARS epitope, such as a virus polypeptide,
and determining whether the antibody specifically binds to the polypeptide. In
alternative embodiments, the invention provides a method for detecting a SARS virus
genome, gene, or homolog or fragment thereof in a sample by contacting a SARS virus
nucleic acid molecule, for example where the nucleic acid molecule includes at least
ten nucleotides, with a preparation of genomic DNA from the sample, under
hybridization conditions providing detection of DNA sequences having nucleotide
sequence identity to a SARS virus nucleic acid molecule. In alternative embodiments,
the invention provides a method of targeting a protein for secretion from a cell, by
attaching a signal sequence from a SARS virus polypeptide to the protein, such that the
protein is secreted from the cell.

In alternative aspects, the invention provides a method for eliciting an immune
response in an animal, by identifying an animal infected with or at risk for infection
with a SARS virus and administering a SARS virus polypeptide or fragment thereof,
or administering a SARS virus nucleic acid molecule encoding a
SARS virus polypeptide or fragment thereof, to the animal. In alternative embodiments,
the administering results in the production of an antibody in the mammal, or results in
the generation of cytotoxic or helper T-lymphocytes in the mammal.

In alternative embodiments, the invention provides a kit for detecting the
presence of a SARS virus nucleic acid molecule or polypeptide in a sample, where the
kit includes a SARS virus nucleic acid molecule, or an antibody that specifically binds
a SARS virus polypeptide.

In alternative aspects the invention provides a method for treating or preventing
a SARS virus infection by identifying an animal (e.g., a human) infected with or at risk
for infection with a SARS virus, and administering a SARS virus nucleic acid molecule
or polypeptide, or administering a compound that inhibits pathogenicity or replication
of a SARS virus, to the animal. In alternative embodiments, the invention provides the
use of a SARS virus nucleic acid molecule or polypeptide for treating or preventing a
SARS virus infection.

In alternative aspects the invention provides a method of identifying a 
compound for treating or preventing a SARS virus infection, by contacting a sample
including a SARS virus nucleic acid molecule or contacting a SARS virus polypeptide
with the compound, where an increase or decrease in the expression or activity of the
nucleic acid molecule or the polypeptide identifies a compound for treating or
preventing a SARS virus infection.

In alternative aspects the invention provides a vaccine (e.g., a DNA vaccine) 
including a SARS virus nucleic acid molecule or polypeptide. 

In alternative aspects the invention provides a microarray including a plurality 
of elements, wherein each element includes one or more distinct nucleic acid or amino
acid sequences, and where the sequences are selected from a SARS virus nucleic acid
molecule or polypeptide, or an antibody that specifically binds a SARS virus nucleic
acid molecule or polypeptide. 

In alternative aspects the invention provides a computer readable record (e.g., a 
database) including distinct SARS virus nucleic acid or amino acid sequences.

A "SARS virus" is a virus putatively belonging to the coronavirus family and 
identified as the causative agent for severe acute respiratory syndrome (SARS). A
SARS virus nucleic acid molecule may include a sequence substantially identical to the
nucleotide sequences described herein or fragments thereof. A SARS virus polypeptide
may include a sequence substantially identical to a sequence encoded by a SARS virus
nucleic acid molecule, or may include a sequence substantially identical to the 
polypeptide sequences described herein, or fragments thereof. 

A compound is "substantially pure" when it is separated from the components 
that naturally accompany it. Typically, a compound is substantially pure when it is at 
least 60%, more generally 75% or over 90%, by weight, of the total material in a
sample. Thus, for example, a polypeptide that is chemically synthesized or produced
by recombinant technology will generally be substantially free from its naturally
associated components. A nucleic acid molecule may be substantially pure when it is
not immediately contiguous with (i.e., covalently linked to) the coding sequences with
which it is normally contiguous in the naturally occurring genome of the organism from
which the DNA of the invention is derived. A nucleic acid molecule may also be
substantially pure when it is isolated from the organism in which it is normally found.
A substantially pure compound can be obtained, for example, by extraction from a
natural source; by expression of a recombinant nucleic acid molecule encoding a
polypeptide compound; or by chemical synthesis. Purity can be measured using any
appropriate method such as column chromatography, gel electrophoresis, HPLC, etc.

A "substantially identical" sequence is an amino acid or nucleotide sequence
that differs from a reference sequence only by one or more conservative substitutions,
as discussed herein, or by one or more non-conservative substitutions, deletions, or
insertions located at positions of the sequence that do not destroy the biological
function of the amino acid or nucleic acid molecule. Such a sequence can be at least
10%, 20%, 30%, 40%, 50%, 52.5%, 55% or 60% or 75%, or more generally at least
80%, 85%, 90%, or 95%, or as much as 99% or 100% identical at the amino acid or
nucleotide level to the sequence used for comparison using, for example, the Align
Program (Myers and Miller, CABIOS, 1989, 4:11-17) or FASTA. For polypeptides,
the length of comparison sequences may be at least 4, 5, 10, or 15 amino acids, or at
least 20, 25, or 30 amino acids. In alternate embodiments, the length of comparison
sequences may be at least 35, 40, or 50 amino acids, or over 60, 80, or 100 amino acids.
For nucleic acid molecules, the length of comparison sequences may be at least 15, 20,
or 25 nucleotides, or at least 30, 40, or 50 nucleotides. In alternate embodiments, the
length of comparison sequences may be at least 60, 70, 80, or 90 nucleotides, or over
100, 200, or 500 nucleotides. Sequence identity can be readily measured using publicly
available sequence analysis software (e.g., Sequence Analysis Software Package of the
Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710
University Avenue, Madison, Wis. 53705, or BLAST software available from the
National Library of Medicine, or as described herein). Examples of useful software
include the programs Pile-up and PrettyBox. Such software matches similar sequences
by assigning degrees of homology to various substitutions, deletions, insertions, and
other modifications.
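
The percent identity figures in the definition above can be illustrated with a small calculation. The sketch below is not the Align program, FASTA, or BLAST; it only counts identical columns in two sequences that are assumed to have been aligned already, with "-" marking gaps. Counting a gap opposite a residue as a mismatch is one convention among several and is an assumption here, as is the function name percent_identity.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned (equal length, gaps as `gap`).
    Columns that are gaps in both sequences are ignored; a gap opposite a
    residue counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # uninformative column
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / columns

# Hypothetical toy alignment: 4 of 5 columns match
print(percent_identity("AC-GT", "ACTGT"))  # 80.0
```

A real comparison would first compute the alignment with one of the programs named in the text; the identity percentage then depends on that program's scoring and gap parameters.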

Alternatively, or additionally, two nucleic acid sequences may be "substantially
identical" if they hybridize under high stringency conditions. In some embodiments,
high stringency conditions are, for example, conditions that allow hybridization
comparable with the hybridization that occurs using a DNA probe of at least 500
nucleotides in length, in a buffer containing 0.5 M NaHPO4, pH 7.2, 7% SDS, 1 mM
EDTA, and 1% BSA (fraction V), at a temperature of 65°C, or a buffer containing 48%
formamide, 4.8x SSC, 0.2 M Tris-Cl, pH 7.6, 1x Denhardt's solution, 10% dextran
sulfate, and 0.1% SDS, at a temperature of 42°C. (These are typical conditions for high
stringency northern or Southern hybridizations.) Hybridizations may be carried out
over a period of about 20 to 30 minutes, or about 2 to 6 hours, or about 10 to 15 hours,
or over 24 hours or more. High stringency hybridization is also relied upon for the
success of numerous techniques routinely performed by molecular biologists, such as
high stringency PCR, DNA sequencing, single strand conformational polymorphism
analysis, and in situ hybridization. In contrast to northern and Southern hybridizations,
these techniques are usually performed with relatively short probes (e.g., usually about
16 nucleotides or longer for PCR or sequencing and about 40 nucleotides or longer for
in situ hybridization). The high stringency conditions used in these techniques are well
known to those skilled in the art of molecular biology, and examples of them can be
found, for example, in Ausubel et al., Current Protocols in Molecular Biology, John
Wiley & Sons, New York, N.Y., 1998, which is hereby incorporated by reference.

The terms "nucleic acid" or "nucleic acid molecule" encompass both RNA (plus 
and minus strands) and DNA, including cDNA, genomic DNA, and synthetic (e.g., 
chemically synthesized) DNA. The nucleic acid may be double-stranded or single- 
stranded. Where single-stranded, the nucleic acid may be the sense strand or the 

antisense strand. A nucleic acid molecule may be any chain of two or more covalently
bonded nucleotides, including naturally occurring or non-naturally occurring
nucleotides, or nucleotide analogs or derivatives. By "RNA" is meant a sequence of
two or more covalently bonded, naturally occurring or modified ribonucleotides. One
example of a modified RNA included within this term is phosphorothioate RNA. By
"DNA" is meant a sequence of two or more covalently bonded, naturally occurring or
modified deoxyribonucleotides. By "cDNA" is meant complementary or copy DNA
produced from an RNA template by the action of RNA-dependent DNA polymerase
(reverse transcriptase). Thus a "cDNA clone" means a duplex DNA sequence
complementary to an RNA molecule of interest, carried in a cloning vector.

An "isolated nucleic acid" is a nucleic acid molecule that is free of the nucleic 
acid molecules that normally flank it in the genome or that is free of the organism in
which it is normally found. Therefore, an "isolated" gene or nucleic acid molecule is in
some cases intended to mean a gene or nucleic acid molecule which is not flanked by
nucleic acid molecules which normally (in nature) flank the gene or nucleic acid
molecule (such as in genomic sequences) and/or has been completely or partially
purified from other transcribed sequences (as in a cDNA or RNA library). In some
cases, an isolated nucleic acid molecule is intended to mean the genome of an organism
such as a virus. An isolated nucleic acid of the invention may be substantially isolated
with respect to the complex cellular milieu in which it naturally occurs. In some
instances, the isolated material will form part of a composition (for example, a crude
extract containing other substances), buffer system or reagent mix. In other
circumstances, the material may be purified to essential homogeneity, for example as
determined by PAGE or column chromatography such as HPLC. The term therefore
includes, e.g., a genome; a recombinant nucleic acid incorporated into a vector, such as
an autonomously replicating plasmid or virus; or into the genomic DNA of a
prokaryote or eukaryote, or which exists as a separate molecule (e.g., a cDNA or a
genomic DNA fragment produced by PCR or restriction endonuclease treatment)
independent of other sequences. It also includes a recombinant nucleic acid which is
part of a hybrid gene encoding additional polypeptide sequences. Preferably, an
isolated nucleic acid comprises at least about 50, 80 or 90 percent (on a molar basis) of
all macromolecular species present. Thus, an isolated gene or nucleic acid molecule can
include a gene or nucleic acid molecule which is synthesized chemically or by
recombinant means. Recombinant DNA contained in a vector is included in the
definition of "isolated" as used herein. Also, isolated nucleic acid molecules include
recombinant DNA molecules in heterologous host cells, as well as partially or
substantially purified DNA molecules in solution. In vivo and in vitro RNA transcripts
of the DNA molecules of the present invention are also encompassed by "isolated"
nucleic acid molecules. Such isolated nucleic acid molecules are useful in the
manufacture of the encoded polypeptide, as probes for isolating homologous sequences
(e.g., from other species), for gene mapping (e.g., by in situ hybridization with
chromosomes), or for detecting expression of the nucleic acid molecule in tissue (e.g.,
human tissue, such as peripheral blood), such as by Northern blot analysis.

Various genes and nucleic acid sequences of the invention may be recombinant 
sequences. The term "recombinant" means that something has been recombined, so that
when made in reference to a nucleic acid construct the term refers to a molecule that is
comprised of nucleic acid sequences that are joined together or produced by means of
molecular biological techniques. The term "recombinant" when made in reference to a
protein or a polypeptide refers to a protein or polypeptide molecule which is expressed
using a recombinant nucleic acid construct created by means of molecular biological
techniques. The term "recombinant" when made in reference to genetic composition
refers to a gamete or progeny with new combinations of alleles that did not occur in the
parental genomes. Recombinant nucleic acid constructs may include a nucleotide
sequence which is ligated to, or is manipulated to become ligated to, a nucleic acid
sequence to which it is not ligated in nature, or to which it is ligated at a different
location in nature. Referring to a nucleic acid construct as "recombinant" therefore
indicates that the nucleic acid molecule has been manipulated using genetic
engineering, i.e. by human intervention. Recombinant nucleic acid constructs may for
example be introduced into a host cell by transformation. Such recombinant nucleic
acid constructs may include sequences derived from the same host cell species or from
different host cell species, which have been isolated and reintroduced into cells of the
host species. Recombinant nucleic acid construct sequences may become integrated
into a host cell genome, either as a result of the original transformation of the host cells,
or as the result of subsequent recombination and/or repair events.

As used herein, "heterologous" in reference to a nucleic acid or protein is a
molecule that has been manipulated by human intervention so that it is located in a
place other than the place in which it is naturally found. For example, a nucleic acid
sequence from one species may be introduced into the genome of another species, or a
nucleic acid sequence from one genomic locus may be moved to another genomic or
extrachromosomal locus in the same species. A heterologous protein includes, for
example, a protein expressed from a heterologous coding sequence or a protein
expressed from a recombinant gene in a cell that would not naturally express the
protein.

By "antisense," as used herein in reference to nucleic acids, is meant a nucleic 
acid sequence that is complementary to one strand of a nucleic acid molecule. In some
embodiments, an antisense sequence is complementary to the coding strand of a gene,
preferably, a SARS virus gene. The preferred antisense nucleic acid molecule is one
which is capable of lowering the level of polypeptide encoded by the complementary
gene when both are expressed in a cell. In some embodiments, the polypeptide level is
lowered by at least 10%, or at least 25%, or at least 50%, as compared to the
polypeptide level in a cell expressing only the gene, and not the complementary
antisense nucleic acid molecule.

A "probe" or "primer" is a single-stranded DNA or RNA molecule of defined
sequence that can base pair to a second DNA or RNA molecule that contains a
complementary sequence (the target). The stability of the resulting hybrid molecule
depends upon the extent of the base pairing that occurs, and is affected by parameters
such as the degree of complementarity between the probe and target molecule, and the
degree of stringency of the hybridization conditions. The degree of hybridization
stringency is affected by parameters such as the temperature, salt concentration, and
concentration of organic molecules, such as formamide, and is determined by methods
that are known to those skilled in the art. Probes or primers specific for SARS virus
nucleic acid sequences or molecules may vary in length from at least 8 nucleotides to
over 500 nucleotides, including any value in between, depending on the purpose for
which, and conditions under which, the probe or primer is used. For example, a probe
or primer may be 8, 10, 15, 20, or 25 nucleotides in length, or may be at least 30, 40,
50, or 60 nucleotides in length, or may be over 100, 200, 500, or 1000 nucleotides in
length. Probes or primers specific for SARS virus nucleic acid molecules may have
greater than 20-30% sequence identity, or at least 55-75% sequence identity, or at least
75-85% sequence identity, or at least 85-99% sequence identity, or 100% sequence
identity to the nucleic acid sequences described herein. In various embodiments of the
invention, probes having the sequences: 5'-ATg AAT TAG CAA gTC AAT ggT TAG-3',
SEQ ID NO: 160; 5'-gAA gCT ATT CgT CAC gTT Gg-3', SEQ ID NO: 161;
5'-CTg TAg AAA ATG CTA gCT ggA g-3', SEQ ID NO: 162; 5'-GAT AAC GAg TGg
gTA CAg CTA-3', SEQ ID NO: 163; 5'-TTA TCA CCC gCgAAg AAg CT-3', SEQ
ID NO: 164; 5'-CTC TAg TTg CATgAC AgC CCT C-3', SEQ ID NO: 165; 5'-TCg
TgC gTg gAT TggCTT TgA TgT-3', SEQ ID NO: 166; 5'-ggg TTg ggA CTA TCC
TAA gTg TgA-3', SEQ ID NO: 167; 5'-TAA CAC ACA AAC ACC ATC ATC A-3',
SEQ ID NO: 168; 5'-ggT Tgg gAC TAT CCT AAg TgT gA-3', SEQ ID NO: 169;
5'-CCA TCA TCA gAT AgA ATC ATC ATA-3', SEQ ID NO: 170; 5'-CCT CTC TTg
TTC TTg CTC gCA-3', SEQ ID NO: 171; 5'-TAT AgT gAg CCg CCA CAC Atg-3',
SEQ ID NO: 172; 5'-TAACACACAACICCATCATCA-3', SEQ ID NO: 173;
5'-CTAACATGCTTAGGATAATGG-3', SEQ ID NO: 174;
5'-GCCTCTCTTGTTCTTGCTCGC-3', SEQ ID NO: 175;
5'-CAGGTAAGCGTAAAACTCATC-3', SEQ ID NO: 176;
5'-TACACACCTCAGCGTTG-3', SEQ ID NO: 177; 5'-CACGAACGTGACGAAT-3',
SEQ ID NO: 178; 5'-GCCGGAGCTCTGCAGAATTC-3', SEQ ID NO: 179;
5'-CAGGAAACAGCTATGAC TTGCATCACCACTAGTTGTGCCACCAGGTT-3',
SEQ ID NO: 180;
5'-TGTAAAACGACGGCCAGTTGATGGGATGGGACTATCCTAAGTGTGA-3', SEQ
ID NO: 181; 5'-GCATAGGCAGTAGTTGCATC-3', SEQ ID NO: 182, as well as
sequences amplified by specific combinations of these probes, may be excluded from
specific uses according to the invention. Probes can be detectably-labeled, either
radioactively or non-radioactively, by methods that are known to those skilled in the
art. Probes can be used for methods involving nucleic acid hybridization, such as
nucleic acid sequencing, nucleic acid amplification by the polymerase chain reaction,
single stranded conformational polymorphism (SSCP) analysis, restriction fragment
length polymorphism (RFLP) analysis, Southern hybridization, northern hybridization,
in situ hybridization, electrophoretic mobility shift assay (EMSA), and other methods
that are known to those skilled in the art.
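
As noted above, hybrid stability depends on probe length, base composition, and hybridization conditions. As a rough illustration only, the sketch below applies the Wallace rule of thumb (about 2 °C per A or T and 4 °C per G or C), a widely used approximation for short oligonucleotides that is not taken from this document and ignores salt and formamide concentration. The function name wallace_tm and the example oligonucleotide are hypothetical.

```python
def wallace_tm(oligo):
    """Estimate melting temperature (deg C) of a short DNA oligo.

    Uses the Wallace rule, Tm = 2*(A+T) + 4*(G+C), a rule of thumb that is
    only reasonable for oligonucleotides of roughly 14 to 20 nucleotides.
    Spaces and letter case are ignored.
    """
    seq = oligo.replace(" ", "").upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return 2 * at + 4 * gc

# Hypothetical 16-nucleotide probe with 8 A/T and 8 G/C bases
print(wallace_tm("ATGC ATGC ATGC ATGC"))  # 48
```

Estimates of this kind are a starting point at best; actual stringency is set empirically, as the text indicates.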

By "complementary" is meant that two nucleic acid molecules, e.g., DNA or
RNA, contain a sufficient number of nucleotides that are capable of forming
Watson-Crick base pairs to produce a region of double-strandedness between the two
nucleic acids. Thus, adenine in one strand of DNA or RNA pairs with thymine in an opposing
complementary DNA strand, or with uracil in an opposing complementary RNA strand.
It will be understood that each nucleotide in a nucleic acid molecule need not form a
matched Watson-Crick base pair with a nucleotide in an opposing complementary
strand to form a duplex.
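
The pairing rules in this definition can be sketched in a few lines of Python. The example below builds the reverse complement of a strand and reports what fraction of positions form Watson-Crick pairs between two antiparallel DNA strands, which illustrates that a duplex need not be perfectly matched. The function names and the toy sequences are hypothetical.

```python
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(seq, rna=False):
    """Return the complementary strand, written 5'->3'."""
    pairs = RNA_PAIRS if rna else DNA_PAIRS
    return "".join(pairs[base] for base in reversed(seq.upper()))

def fraction_paired(strand_a, strand_b):
    """Fraction of positions forming Watson-Crick pairs (DNA only).

    Both strands are written 5'->3' and must have equal length; strand_b is
    read in reverse so that the two strands are antiparallel.
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have equal length")
    a = strand_a.upper()
    b = strand_b.upper()[::-1]
    paired = sum(1 for x, y in zip(a, b) if DNA_PAIRS.get(x) == y)
    return paired / len(a)

print(reverse_complement("ATGC"))            # GCAT
print(reverse_complement("AUGC", rna=True))  # GCAU
print(fraction_paired("ATGC", "GCAA"))       # 0.75 (one mismatch)
```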

By "vector" is meant a DNA molecule derived, e.g., from a plasmid,
bacteriophage, or mammalian or insect virus, or artificial chromosome, that may be
used to introduce a polypeptide, for example a SARS virus polypeptide, into a host cell
by means of replication or expression of an operably linked heterologous nucleic acid
molecule. By "operably linked" is meant that a nucleic acid molecule such as a gene
and one or more regulatory sequences (e.g., promoters, ribosomal binding sites,
terminators in prokaryotes; promoters, terminators, enhancers in eukaryotes; leader
sequences, etc.) are connected in such a way as to permit the desired function, e.g., gene
expression, when the appropriate molecules (e.g., transcriptional activator proteins) are
bound to the regulatory sequences. A vector may contain one or more unique restriction
sites and may be capable of autonomous replication in a defined host or vehicle
organism such that the cloned sequence is reproducible. By "DNA expression vector"
is meant any autonomous element capable of directing the synthesis of a recombinant
peptide. Such DNA expression vectors include bacterial plasmids and phages and
mammalian and insect plasmids and viruses. A "shuttle vector" is understood as
meaning a vector which can be propagated in at least two different cell types, or
organisms, for example vectors which are first propagated or replicated in prokaryotes
for subsequent transfection into, for example, eukaryotic cells. A "replicon" is
a unit that is capable of autonomous replication in a cell and may include plasmids,
chromosomes (e.g., mini-chromosomes), cosmids, viruses, etc. A replicon may be a
vector.

A "host cell" is any cell, including a prokaryotic or eukaryotic cell, into which a
replicon, such as a vector, has been introduced by, for example, transformation,
transfection, or infection.

An "open reading frame" or "ORF" is a nucleic acid sequence that encodes a
polypeptide. An ORF may include a coding sequence, i.e., a sequence that is
capable of being transcribed into mRNA and/or translated into a protein when
combined with the appropriate regulatory sequences. In general, a coding sequence
includes a 5' translation start codon and a 3' translation stop codon.
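
The start-codon and stop-codon structure described above can be sketched as a simple forward-strand scan. This is only an illustration: the function name, the minimum-length parameter, and the toy sequence are assumptions, and the scan ignores the reverse strand and any regulatory context.

```python
# Minimal sketch: locate ORFs (ATG ... stop) on the forward strand only.
# The minimum length and the toy sequence are hypothetical.

STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end) pairs, 0-based and end-exclusive, for ORFs that
    begin with ATG and end at the first in-frame stop codon (stop included).
    `min_codons` counts the codons before the stop codon."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


toy = "CCATGGCTAAATGACC"
orfs = find_orfs(toy)
```

An ATG that is never followed by an in-frame stop codon is not reported, in keeping with the general statement that a coding sequence includes both a start and a stop codon.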



A "leader sequence" is a relatively short nucleotide sequence located at the 5'
end of an RNA molecule that acts as a primer for transcription.

A "transcriptional regulatory sequence", "TRS" or "intergenic sequence" is a
nucleotide sequence that lies upstream of an open reading frame (ORF) and serves as a
template for the reassociation of a nascent RNA strand-polymerase complex.

A "frameshift mutation" is caused by a shift in an open reading frame, generally
due to a deletion or addition of at least one nucleotide, such that an alternative
polypeptide is ultimately translated.
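
To make the effect of such a shift concrete, the sketch below translates a short hypothetical ORF before and after deleting one nucleotide. The sequence and function name are assumptions for illustration; the codon table is the standard genetic code laid out in the conventional TCAG order.

```python
# Minimal sketch: effect of a single-nucleotide deletion on translation.
# Standard genetic code, built from the conventional TCAG codon ordering.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate in frame 0 until the first stop codon or end of sequence."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


wild_type = "ATGGCTAAAGGTTGA"            # hypothetical ORF
mutant = wild_type[:4] + wild_type[5:]   # delete one nucleotide (index 4)

wild_type_protein = translate(wild_type)
mutant_protein = translate(mutant)
```

Every codon downstream of the deletion is read in a different frame, so the two translations diverge after the initiating methionine.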

By "detectably labeled" is meant any means for marking and identifying the
presence of a molecule, e.g., an oligonucleotide probe or primer, a gene or fragment
thereof, a cDNA molecule, a polypeptide, or an antibody. Methods for detectably
labeling a molecule are well known in the art and include, without limitation,
radioactive labeling (e.g., with an isotope such as P or S) and nonradioactive
labeling such as enzymatic labeling (for example, using horseradish peroxidase or
alkaline phosphatase), chemiluminescent labeling, fluorescent labeling (for example,
using fluorescein), bioluminescent labeling, antibody detection of a ligand attached to
the probe, or detection of double-stranded nucleic acid. Also included in this definition
is a molecule that is detectably labeled by an indirect means, for example, a molecule
that is bound with a first moiety (such as biotin) that is, in turn, bound to a second
moiety that may be observed or assayed (such as fluorescein-labeled streptavidin).
Labels also include digoxigenin, luciferases, and aequorin.

A "peptide," "protein," "polyprotein" or "polypeptide" is any chain of two or
more amino acids, including naturally occurring or non-naturally occurring amino acids
or amino acid analogues, regardless of post-translational modification (e.g.,
glycosylation or phosphorylation). A "polyprotein", "polypeptide", "peptide" or
"protein" of the invention may include peptides or proteins that have abnormal
linkages, cross links and end caps, non-peptidyl bonds or alternative modifying groups.
Such modified peptides are also within the scope of the invention. The term
"modifying group" is intended to include structures that are directly attached to the
peptidic structure (e.g., by covalent coupling), as well as those that are indirectly
attached to the peptidic structure (e.g., by a stable non-covalent association or by
covalent coupling to additional amino acid residues, or mimetics, analogues or
derivatives thereof, which may flank the core peptidic structure). For example, the
modifying group can be coupled to the amino-terminus or carboxy-terminus of a
peptidic structure, or to a peptidic or peptidomimetic region flanking the core domain.
Alternatively, the modifying group can be coupled to a side chain of at least one amino
acid residue of a peptidic structure, or to a peptidic or peptidomimetic region flanking
the core domain (e.g., through the epsilon amino group of a lysyl residue(s), through
the carboxyl group of an aspartic acid residue(s) or a glutamic acid residue(s), through
a hydroxy group of a tyrosyl residue(s), a serine residue(s) or a threonine residue(s) or
other suitable reactive group on an amino acid side chain). Modifying groups
covalently coupled to the peptidic structure can be attached by means and using
methods well known in the art for linking chemical structures, including, for example,
amide, alkylamino, carbamate or urea bonds.

A "polyprotein" is the polypeptide that is initially translated from the genome of
a plus-stranded RNA virus, for example, a SARS virus. Accordingly, a polyprotein has
not been subjected to post-translational processing by proteolytic cleavage into its
processed protein products, and therefore retains its cleavage sites. In some
embodiments of the invention, the protease cleavage sites of a polyprotein may be
modified, for example, by amino acid substitution, to result in a polyprotein that is
incapable of being cleaved into its processed protein products.


An antibody "specifically binds" or "selectively binds" an antigen when it
recognizes and binds the antigen, but does not substantially recognize and bind other
molecules in a sample, having for example an affinity for the antigen which is 10, 100,
1000 or 10000 times greater than the affinity of the antibody for another reference
molecule in a sample. A "neutralizing antibody" is an antibody that selectively
interferes with any of the biological activities of a SARS virus polypeptide or
polyprotein, for example, replication of the SARS virus, or infection of host cells. A
neutralizing antibody may reduce the ability of a SARS virus polypeptide to carry out
its specific biological activity by about 50%, or by about 70%, or by about 90% or
more, or may completely abolish the ability of a SARS virus polypeptide to carry out
its specific biological activity. Any standard assay for the biological activity of any
SARS virus polypeptide, for example, assays determining expression levels, ability to
infect host cells, or ability to replicate DNA, including those assays described herein or
known to those of skill in the art, may be used to assess potentially neutralizing
antibodies that are specific for SARS virus polypeptides.

A "signal sequence" is a sequence of amino acids that may be identified, for
example by homology or biological activity, to a peptide sequence with the known
function of targeting a polypeptide to a particular region of the cell. A signal sequence
or signal peptide may be a peptide of any length that is capable of targeting a
polypeptide to a particular region of the cell. In some embodiments, the signal
sequence may direct the polypeptide to the cellular membrane so that the polypeptide
may be secreted. In alternate embodiments, the signal sequence may direct the
polypeptide to an intracellular compartment or organelle, such as the Golgi apparatus,
or to the surface of a virus, such as the SARS virus. In alternate embodiments, a signal
sequence may range from about 13 or 15 amino acids in length to about 60 amino acids
in length.

A "transmembrane protein" is an amphipathic protein having a hydrophobic
region ("transmembrane domain") that spans the lipid bilayer of the cell membrane
from the cytoplasm to the cell surface, or spans the viral envelope, interspersed
between hydrophilic regions on both sides of the membrane. The number of
hydrophobic regions in an amphipathic protein is often proportional to the number of
times that protein spans the lipid bilayer. Thus, a single transmembrane protein spans
the lipid bilayer once, and has a single transmembrane domain, while a
multi-transmembrane protein spans the lipid bilayer multiple times. Multi-transmembrane
proteins may enable virus entry into a host cell, or act to initiate transduction of a signal
from the cell surface to the interior of the cell, for example, by a conformational change
upon ligand binding. A "transmembrane anchor" is a transmembrane domain that
maintains a polypeptide in its position in the cell membrane or viral envelope and is
generally hydrophobic. A transmembrane anchor may generally be in the structure of
an alpha helix, i.e., a "transmembrane helix". Multi-transmembrane proteins may have
multiple transmembrane alpha-helices.

A "nuclear localization signal" is an amino acid sequence that permits the entry
of a polypeptide into the nucleus of a cell through nuclear pores. A nuclear localization
signal generally has a cluster of positively charged residues, for example, lysines. A
"lysine-rich sequence" is a sequence having at least two contiguous lysine residues, or
at least three contiguous lysine residues. In some embodiments, a lysine-rich sequence
may be a nuclear localization signal.
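
The "at least two" and "at least three" contiguous lysine criteria can be checked with a short pattern search. The sketch below is illustrative only: the function name and the example sequence are hypothetical, and a hit says nothing about whether the run actually functions as a nuclear localization signal.

```python
# Minimal sketch: find runs of contiguous lysine (K) residues.
# The example sequence is hypothetical.
import re


def lysine_runs(protein, min_len=2):
    """Return (start, run) for each maximal run of >= min_len lysines,
    with 0-based start positions."""
    pattern = re.compile(r"K{%d,}" % min_len)
    return [(m.start(), m.group()) for m in pattern.finditer(protein.upper())]


example = "MSDKKAPRKKKRGLK"
runs_of_two = lysine_runs(example, min_len=2)
runs_of_three = lysine_runs(example, min_len=3)
```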

An "ATP binding domain" is a consensus domain that is found in many ATP- or
GTP-binding proteins, and that forms a flexible loop (P-loop) between alpha-helical
and beta pleated sheet domains. The general consensus for an ATP binding domain
may be (A or G)-XXXXGK-(S or T).
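
The consensus (A or G)-XXXXGK-(S or T) translates directly into a regular expression. The sketch below is one way to scan a protein sequence for it; the function name and the example sequence are hypothetical, and `finditer` reports non-overlapping matches only.

```python
# Minimal sketch: scan a protein sequence for the P-loop consensus
# (A or G)-XXXXGK-(S or T). The example sequence is hypothetical.
import re

P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(protein):
    """Return (start, matched_text) for each non-overlapping consensus match,
    with 0-based start positions."""
    return [(m.start(), m.group()) for m in P_LOOP.finditer(protein.upper())]


example = "MKLGPTGSGKTTLAV"
hits = find_p_loops(example)
```

A match indicates only that the sequence fits the consensus pattern; it is not evidence of nucleotide binding.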

An "RNA binding protein" is a protein that is capable of binding to an RNA
molecule (see, for example, "RNA Binding Proteins: New Concepts in Gene
Regulation" 1st ed., eds. K. Sandberg and S.E. Mulroney, Kluwer Academic
Publishers, 2001). RNA binding proteins may contain common structural features such
as arginine-rich tracts, for example, arginines alternating with aspartates, serines, or
glycines, or zinc finger regions. RNA binding proteins may also have a common
ribonucleotide sequence domain. RNA binding proteins are believed to play diverse
roles in modulating post-transcriptional gene expression.

An "immune response" includes, but is not limited to, one or more of the
following responses in a mammal: induction of antibodies, B cells, T cells (including
helper T cells, suppressor T cells, cytotoxic T cells, γδ T cells) directed specifically to
the antigen(s) in a composition or vaccine, following administration of the composition
or vaccine. An immune response to a composition or vaccine thus generally includes
the development in the host mammal of a cellular and/or antibody-mediated response to
the composition or vaccine of interest. In general, the immune response will result in
prevention or reduction of infection by a SARS virus.

An "immunogenic fragment" of a polypeptide or nucleic acid molecule refers to
an amino acid or nucleotide sequence that elicits an immune response. Thus, an
immunogenic fragment may include, without limitation, any portion of any of the
SARS virus sequences described herein, or a sequence substantially identical thereto,
that includes one or more epitopes (the antigenic determinant, i.e., the site recognized
by a specific immune system cell, such as a T cell or a B cell). An "epitope" may include
amino acids in a spatial orientation such that they are non-contiguous in the amino acid
sequence but are near each other due to the three dimensional conformation of the
polypeptide. An epitope may include at least 3, 5, 8, or 10 or more amino acids.
Immunogenic fragments or epitopes may be identified using standard methods known
to those of skill in the art, such as epitope mapping techniques or antigenicity or
hydropathy plots using, for example, the Omiga version 1.0 program from Oxford
Molecular Group (see, for example, U.S. Patent No. 4,708,871). Immunogenic
fragments or epitopes may also be identified using methods for determining three
dimensional molecule structure such as X-ray crystallography or nuclear magnetic
resonance.

A "sample" may be a tissue biopsy, amniotic fluid, cell, blood, serum, plasma,
urine, stool, sputum, conjunctiva, or any other specimen, or any extract thereof
obtained from a patient (human or animal), test subject, or experimental animal. A
"sample" may also be a cell or cell line created under experimental conditions, and
constituents thereof (such as cell culture supernatants, cell fractions, infected cells,
etc.). The sample may be analyzed to detect the presence of a SARS virus gene,
genome, polypeptide, nucleic acid molecule or virion, or to detect a mutation in a
SARS virus gene, expression levels of a SARS virus gene or polypeptide, or the
biological function of a SARS virus polypeptide, using methods that are known in the
art. For example, methods such as sequencing, single-strand conformational
polymorphism (SSCP) analysis, or restriction fragment length polymorphism (RFLP)
analysis of PCR products derived from a sample can be used to detect a mutation in a
SARS virus gene; ELISA or western blotting can be used to measure levels of SARS
virus polypeptide or antibody affinity; northern blotting can be used to measure SARS
mRNA levels; or PCR can be used to measure the level of a SARS virus nucleic acid
molecule.

Other features and advantages of the invention will be apparent from the
following description of the drawings and the invention, and from the claims.

Brief Description of the Drawings
Figures 1A-D show phylogenetic analyses of SARS proteins. Unrooted
phylogenetic trees were generated by clustalw (Thompson, J. D. et al., Nucleic Acids
Res 22, 4673-80, Nov 11, 1994) bootstrap analysis using 1000 iterations. Genbank
accessions for protein sequences are as follows: Figure 1A: Replicase 1A: BoCov
(Bovine Coronavirus):AAL40396, 229E (Human Coronavirus):NP_07355, MHV
(Mouse Hepatitis Virus):NP_045298, AIBV (Avian Infectious bronchitis
virus):CAC39113, TGEV (Transmissible Gastroenteritis Virus):NP_058423. Figure
1B: Matrix Glycoprotein: PHEV (Porcine hemagglutinating encephalomyelitis
virus):AAL80035, BoCov (Bovine Coronavirus):NP_150082, AIBV & AIBV2 (Avian
infectious bronchitis virus):AAF35863 & AAK83027, MHV (Mouse hepatitis
virus):AAF36439, TGEV (Transmissible gastroenteritis virus):NP_058427, 229E &
OC43 (Human Coronavirus):NP_073555 & AAA45462, FCV (Feline
coronavirus):BAC01160. Figure 1C: Nucleocapsid: MHV (Mouse hepatitis
virus):P18446, BoCov (Bovine coronavirus):NP_150083, AIBV (Avian infectious
bronchitis virus):AAK27162, FCV (Feline coronavirus):CAA74230, PTGV (Porcine
transmissible gastroenteritis virus):AAM97563, 229E & OC43 (Human
coronavirus):NP_073556 & P33469, PHEV (porcine hemagglutinating
encephalomyelitis virus):AAL80036, TCV (Turkey coronavirus):AAF23873. Figure
1D: S (Spike) Protein: BoCov (Bovine coronavirus):AAL40400, MHV (Mouse
hepatitis virus):P11225, OC43 & 229E (Human coronavirus):S44241 & AAK32191,
PHEV (Porcine hemagglutinating encephalomyelitis virus):AAL80031, PRC (Porcine
respiratory coronavirus):AAA46905, PEDV (Porcine epidemic diarrhea
virus):CAA80971, CCov (Canine coronavirus):S41453, FICV (Feline infectious
peritonitis virus):BAA06805, AIBV (Avian infectious bronchitis virus):AAO34396.

Figure 2 shows a schematic representation of the ORFs and s2m motif in the
29,736-base SARS virus genome.

Figures 3A-P show nucleotide sequences of the 29,736-base genome of the
SARS virus (SEQ ID NOs: 1 and 2).

Figure 4 shows an alignment of the s2m regions from Avian infectious
bronchitis virus (AIBV; SEQ ID NO: 32) and equine rhinovirus serotype 2 (ERV-2;
SEQ ID NO: 31) with the 3' untranslated region (UTR; SEQ ID NO: 18) of the SARS
virus (TOR2). The conserved areas in the s2m region are indicated by asterisks.

Figure 5 shows the amino acid sequence of the SARS virus S (Spike) 
Glycoprotein (SEQ ID NO: 33). 

Figure 6 shows the amino acid sequence of the SARS virus M (Matrix)
Glycoprotein (SEQ ID NO: 34).

Figure 7 shows the amino acid sequence of the SARS virus E (Small envelope)
protein (SEQ ID NO: 35).

Figure 8 shows the amino acid sequence of the SARS virus N (Nucleocapsid)
Protein (SEQ ID NO: 36).

Figure 9 shows an alignment of the matrix glycoprotein M from the SARS
virus (Tor2_M or ORF5; SEQ ID NO: 34) and various other matrix glycoproteins (SEQ
ID NOs: 37-43). Asterisks (*) indicate percentage identity to the SARS matrix protein
as calculated by Align (Myers and Miller, CABIOS (1989) 4:11-17).

Figures 10A-B show an alignment of the nucleocapsid protein N from the
SARS virus (Tor2_N; SEQ ID NO: 36) and various other nucleocapsid proteins (SEQ
ID NOs: 44-52). Asterisks (*) indicate percentage identity to the SARS nucleocapsid
protein calculated by Align (Myers and Miller, CABIOS (1989) 4:11-17).

Figures 11A-K show the nucleotide sequence of the 29,751-base genome of the SARS
virus (SEQ ID NO: 15).

Figure 12 shows a schematic representation of the ORFs and s2m motif in the
29,751-base SARS virus genome.

Figures 13A-D show phylogenetic analyses of SARS proteins. Unrooted
phylogenetic trees were generated by clustalw 1.74 (J. D. Thompson, D. G. Higgins, T.
J. Gibson, Nucleic Acids Res 22, 4673-80 (Nov 11, 1994)) using the BLOSUM
comparison matrix and a bootstrap analysis of 1000 iterations. Numbers indicate
bootstrap replicates supporting each node. Phylogenetic trees were drawn with the
Phylip Drawtree program 3.6a3 (Felsenstein, J. 1993. PHYLIP (Phylogeny Inference
Package) version 3.5c. Distributed by the author. Department of Genetics, University of
Washington, Seattle). Branch lengths indicate the number of substitutions per residue.
Genbank accessions for protein sequences: A: Replicase 1A: BoCoV (Bovine
Coronavirus):AAL40396, HCoV-229E (Human Coronavirus):NP_07355, MHV
(Mouse Hepatitis Virus):NP_045298, IBV (Avian Infectious bronchitis
virus):CAC39113, TGEV (Transmissible Gastroenteritis Virus):NP_058423. B:
Membrane Glycoprotein: PHEV (Porcine hemagglutinating encephalomyelitis
virus):AAL80035, BoCoV (Bovine Coronavirus):NP_150082, IBV & IBV2 (Avian
infectious bronchitis virus):AAF35863 & AAK83027, MHV (Mouse hepatitis
virus):AAF36439, TGEV (Transmissible gastroenteritis virus):NP_058427,
HCoV-229E & HCoV-OC43 (Human Coronavirus):NP_073555 & AAA45462, FCoV (Feline
coronavirus):BAC01160. C: Nucleocapsid: MHV (Mouse hepatitis virus):P18446,
BoCoV (Bovine coronavirus):NP_150083, IBV 1 & 2 (Avian infectious bronchitis
virus):AAK27162 & NP_040838, FCoV (Feline coronavirus):CAA74230, PTGV
(Porcine transmissible gastroenteritis virus):AAM97563, HCoV-229E & HCoV-OC43
(Human coronavirus):NP_073556 & P33469, PHEV (porcine hemagglutinating
encephalomyelitis virus):AAL80036, TCV (Turkey coronavirus):AAF23873. D: S
(Spike) Protein: BoCoV (Bovine coronavirus):AAL40400, MHV (Mouse hepatitis
virus):P11225, HCoV-OC43 & HCoV-229E (Human coronavirus):S44241 &
AAK32191, PHEV (Porcine hemagglutinating encephalomyelitis virus):AAL80031,
PRCoV (Porcine respiratory coronavirus):AAA46905, PEDV (Porcine epidemic
diarrhea virus):CAA80971, CCoV (Canine coronavirus):S41453, FIPV (Feline
infectious peritonitis virus):BAA06805, IBV (Avian infectious bronchitis
virus):AAO34396.

Figures 14A-F show an alignment of the spike glycoprotein S from the SARS
virus (Tor2_S; SEQ ID NO: 33) and various other spike glycoproteins (SEQ ID NOs:
53-62). Asterisks (*) indicate percentage identity to the SARS spike protein as
calculated by Align (Myers and Miller, CABIOS (1989) 4:11-17).

Figure 15 shows an alignment between the SARS virus Small envelope protein
E (TOR2_E; SEQ ID NO: 35) and the Envelope protein (Protein 4) (X1 protein) (ORF
3) from Porcine transmissible gastroenteritis coronavirus (strain Purdue), Swissprot
accession number P09048 (PGV; SEQ ID NO: 63), as calculated by FASTA
(http://www.ebi.ac.uk/fasta33/).

Figures 16A-B show the amino acid sequence of the SARS virus Replicase 1A
protein (SEQ ID NO: 64).

Figure 17 shows the amino acid sequence of the SARS virus Replicase 1B
protein (SEQ ID NO: 65).

Figure 18 shows the amino acid sequence of ORF3 of SARS virus (SEQ ID
NO: 66).

Figure 19 shows the amino acid sequence of ORF4 of SARS virus (SEQ ID
NO: 67).

Figure 20 shows the amino acid sequence (SEQ ID NO: 68) of ORF6
(nucleotides 27059-27247 of the 29,736-base genome sequence) or ORF 7 (nucleotides
27,074-27,265 of the 29,751-base genome sequence) of SARS virus.

Figure 21 shows the amino acid sequence (SEQ ID NO: 69) of ORF7
(nucleotides 27258-27623 of the 29,736-base genome sequence) or ORF 8 (nucleotides
27,273-27,641 of the 29,751-base genome sequence) of SARS virus.

Figure 22 shows the amino acid sequence (SEQ ID NO: 70) of ORF8
(nucleotides 27623-27754 of the 29,736-base genome sequence) or ORF9
(nucleotides 27,638-27,772 of the 29,751-base genome sequence) of SARS virus.

Figure 23 shows the amino acid sequence (SEQ ID NO: 71) of ORF9
(nucleotides 27764-27880 of the 29,736-base genome sequence) or ORF10 (nucleotides
27,779-27,898 of the 29,751-base genome sequence) of SARS virus.

Figure 24 shows the amino acid sequence (SEQ ID NO: 72) of ORF10
(nucleotides 27849-28100 of the 29,736-base genome sequence) or ORF11 (nucleotides
27,864-28,118 of the 29,751-base genome sequence) of SARS virus.

Figure 25 shows the amino acid sequence of ORF 13 of SARS virus (SEQ ID
NO: 73).

Figure 26 shows the amino acid sequence of ORF 14 of SARS virus (SEQ ID
NO: 74).

Figure 27 shows an alignment of the secreted region of the SARS virus ORF 10
of the 29,751-base genome sequence (sars) with the conotoxin from Conus ventricosus
(conotoxin). Sequence identity is indicated by asterisks and sequence homology is
indicated by dots.

Detailed Description of the Invention
In general, the invention provides nucleic acid molecules, polypeptides, and
other reagents derived from a SARS virus, as well as methods of using such nucleic
acid molecules, polypeptides, and other reagents.

The genome sequence (Figures 3A-P, 11A-K; SEQ ID NOs: 1, 2, and 15)
reveals that the SARS coronavirus is only moderately related to other known
coronaviruses, including two human coronaviruses, OC43 and 229E. Thus, the SARS
virus is a previously unknown virus. The 5' end of the SARS genome contains a 5'
leader sequence (Table 1; SEQ ID NO: 3) with sequence similarity to the highly
conserved coronavirus core leader sequence, 5'-CUAAAC-3' (SEQ ID NO: 75;
Sawicki, S. G., et al., Adv Exp Med Biol 440, 215-9, 1998; Lai, M. M. and D.
Cavanagh, Adv Virus Res 48, 1-100, 1997). Transcriptional regulatory sequences
(TRSs) were identified upstream of all open reading frames (ORFs) (Tables 1 and 2;
SEQ ID NOs: 3-13 and 20-30). ORF9 and ORF10 of the 29,736-base SARS genome
(ORF 10 and ORF 11 of the 29,751-base genome) overlap by 12 amino acids, and have
matches to the TRS consensus in close proximity to their respective initiating
methionine codons.

The 3' UTR sequence (SEQ ID NO: 18) of SARS virus contains an s2m region
(having the sequence ACATTTTCATCGAGGCCACGCGGAGTACGATCGAGGGTACAGTGAAT;
SEQ ID NO: 16) that includes a conserved, discontinuous 32 base-pair s2m motif. The
conserved 32 base-pair motif is a universal feature of astroviruses that has also been
identified in avian coronavirus (AIBV) and the ERV-2 equine rhinovirus. This motif has
been identified by Jonassen C.M. et al. (J Gen Virol 1998 Apr;79 (Pt 4):715-8) as
GCCGNGGCCACGC(G/C)GAGTA(C/G)GANCGAGGGTACAG(G/C) (SEQ ID NO: 19), where N is
generally not part of the conserved motif, and can be any nucleotide. The region
corresponding to the 32 base-pair motif in SARS virus includes the sequence
CGAGGCCACGCGGAGTACGATCGAGGGTACAG (SEQ ID NO: 17), and spans
positions 29590-29621 of the 29,751-base genome. Figure 4 shows an alignment of the
s2m regions from Avian infectious bronchitis virus (AIBV; SEQ ID NO: 32) and
equine rhinovirus serotype 2 (ERV-2; SEQ ID NO: 31), as defined in Jonassen C.M. et
al. (J Gen Virol 1998 Apr;79 (Pt 4):715-8), with the entire 3' untranslated region
(UTR) of the SARS virus (TOR2) (SEQ ID NO: 18).
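
The relationship between the quoted s2m sequences can be checked with a short script. The sketch below converts the degenerate motif notation into a regular expression and locates the 32-base SARS motif within the s2m region. The helper name and the choice of a trimmed "core" pattern are assumptions made for illustration; with the sequences as transcribed here, a strict match of the full motif notation is not expected to succeed against the SARS region, which is in keeping with the motif being described as discontinuous.

```python
# Minimal sketch: relate the quoted s2m sequences to the degenerate motif.
# Sequences are copied from the text; helper and "core" choice are assumptions.
import re

S2M_REGION = "ACATTTTCATCGAGGCCACGCGGAGTACGATCGAGGGTACAGTGAAT"   # SEQ ID NO: 16
S2M_MOTIF_SARS = "CGAGGCCACGCGGAGTACGATCGAGGGTACAG"               # SEQ ID NO: 17
CONSENSUS = "GCCGNGGCCACGC(G/C)GAGTA(C/G)GANCGAGGGTACAG(G/C)"     # SEQ ID NO: 19


def consensus_to_regex(consensus):
    """Rewrite '(X/Y)' alternatives as '[XY]' and 'N' as '.' (any base)."""
    pattern = re.sub(r"\(([ACGT])/([ACGT])\)", r"[\1\2]", consensus)
    return pattern.replace("N", ".")


full = re.compile(consensus_to_regex(CONSENSUS))
# Assumption: trim the first five and the last motif positions to form a core.
core = re.compile(consensus_to_regex("GGCCACGC(G/C)GAGTA(C/G)GANCGAGGGTACAG"))

offset = S2M_REGION.find(S2M_MOTIF_SARS)   # 0-based offset of SEQ ID NO: 17
full_hit = full.search(S2M_REGION)
core_hit = core.search(S2M_REGION)
motif_span = 29621 - 29590 + 1             # inclusive genome coordinates
```

The inclusive coordinate span equals the length of the quoted SARS motif, which is a quick consistency check on the stated positions.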



wo 2004/096842 PCT/CA2004/000626 

Table 1. Listing of the transcription regulatory sequences of the 29,736-base SARS genome,
showing the nucleotide position (base) and associated open-reading frames (ORF). An asterisk
(*) indicates consensus sequence.

Base    ORF      TRS Sequence
45      Leader   TCTCTAAACGAACTTTAAAATCTGTG          (SEQ ID NO: 3)
21464   S        CAACTAAACGAACATG                    (SEQ ID NO: 4)
25238   ORF3     CACATAAACGAACTTATG                  (SEQ ID NO: 5)
26089   E        TGAGTACGAACTTATG                    (SEQ ID NO: 6)
26326   M        GGTCTAAACGAACTAACT 40 ATG           (SEQ ID NO: 7)
26986   ORF6     AACTATAAATT 62 ATG                  (SEQ ID NO: 8)
27244   ORF7     TCCATAAAACGAACATG                   (SEQ ID NO: 9)
27575   ORF8     TGCTCTA GTATTTTTAATACTTTG 24 ATG    (SEQ ID NO: 10)
27751   ORF9     AGTCTAAACGAACATG                    (SEQ ID NO: 11)
27837   ORF10    CTAATAAACCTCATG                     (SEQ ID NO: 12)
28084   N        TAAATAAACGAACAAATTAAAATG            (SEQ ID NO: 13)

Table 2. Listing of the transcription regulatory sequences of the 29,751-base SARS genome,
showing the nucleotide position (base), associated open-reading frames (ORF), and identified
transcription regulatory sequences. Numbers in parentheses within the alignment indicate
distance to the putative initiating codon. The conserved core sequence is indicated in bold in
the putative leader sequence. Contiguous sequences identical to the region of the leader
sequence containing the core sequence are shaded. No putative TRSs were detected for ORFs 4,
13 and 14, although ORF 13 may share the TRS associated with the N protein.

Base    ORF            TRS Sequence
60      Leader         UCUCUAAACGAACUUUAAAAUCUGUG              (SEQ ID NO: 20)
21479   S (Spike)      CAACUAAACGAACAUG                        (SEQ ID NO: 21)
25252   ORF3           CACAUAAACGAACUUAUG                      (SEQ ID NO: 22)
26104   Envelope       UGAGUACGAACUUAUG                        (SEQ ID NO: 23)
26341   M              GGUCUAAACGAACUAACU (40) AUG             (SEQ ID NO: 24)
27001   ORF7           AACUAUAAAUU (62) AUG                    (SEQ ID NO: 25)
27259   ORF8           UCCAUAAAACGAACAUG                       (SEQ ID NO: 26)
27590   ORF9           UGCUCUA---GUAUUUUUAAUACUUUG (24) AUG    (SEQ ID NO: 27)
27766   ORF10          AGUCUAAACGAACAUG                        (SEQ ID NO: 28)
27852   ORF11          CUAAUAAACCUCAUG                         (SEQ ID NO: 29)
28099   Nucleocapsid   UAAAUAAACGAACAAAUUAAAAUG                (SEQ ID NO: 30)
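
The idea of contiguous sequence shared between each TRS and the leader can be sketched as a longest-common-substring search. The example below uses the leader and four of the Table 1 entries (SEQ ID NOs: 3, 5, 6, 11 and 13); the function and variable names are illustrative assumptions, and the dynamic-programming routine is a generic technique rather than the method used to prepare the tables.

```python
# Minimal sketch: longest contiguous stretch shared by each TRS and the leader.
# Sequences are taken from Table 1; names are hypothetical.

def longest_common_substring(a, b):
    """Return the longest contiguous substring shared by a and b
    (the first one found in `a` if there are ties)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


LEADER = "TCTCTAAACGAACTTTAAAATCTGTG"      # SEQ ID NO: 3
TRS = {
    "ORF3": "CACATAAACGAACTTATG",           # SEQ ID NO: 5
    "E": "TGAGTACGAACTTATG",                # SEQ ID NO: 6
    "ORF9": "AGTCTAAACGAACATG",             # SEQ ID NO: 11
    "N": "TAAATAAACGAACAAATTAAAATG",        # SEQ ID NO: 13
}

shared = {name: longest_common_substring(seq, LEADER) for name, seq in TRS.items()}
```

For these four entries the shared stretch always contains ACGAAC, although this small check is not a substitute for the alignment shown in the tables.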

The coding potentials of the 29,736-base and 29,751-base genomes are depicted
in Figures 2 and 12, respectively. Open reading frames (ORFs) include the Replicase
1a and 1b translation products, the Spike glycoprotein, the small Envelope protein, the
Membrane and the Nucleocapsid protein. Construction of unrooted phylogenetic trees
using this set of known proteins from representatives of the three known coronaviral
groups reveals that the proteins encoded by the SARS virus do not readily cluster more
closely with any known group than with any other (Figures 1A-D and 13A-D). In
addition, nine novel ORFs have been analyzed.

The Replicase 1a ORF, located at nucleotides 250-13395 of the 29,736-base
genome and nucleotides 265-13,398 of the 29,751-base genome, and the Replicase 1b ORF,
located at nucleotides 13395-21467 of the 29,736-base genome and nucleotides
13,398-21,485 of the 29,751-base genome, occupy 21.2 kb of the SARS virus genome
(Figures 2 and 12). These genes encode a number of proteins that are produced by
proteolytic cleavage of a large polyprotein (Ziebuhr, J. et al., J Gen Virol 81, 853-79,
Apr, 2000). A frame shift mutation interrupts the protein-coding region, separating the
1a and 1b open-reading frames. The proteins encoded by the Replicase 1a and 1b
ORFs are depicted in Figures 16A-B and 17 (SEQ ID NOs: 64 and 65).

The Spike glycoprotein (S) gene (E2 glycoprotein gene; Figures 2 and 12;
nucleotides 21477 to 25241 of the 29,736-base genome, and nucleotides 21,492 to
25,259 of the 29,751-base genome) encodes a surface projection glycoprotein precursor
of about 1,255 amino acids in length (Figure 5; SEQ ID NO: 33), which may be
significant in the virulence of the SARS virus. Mutations in this gene are correlated
with altered pathogenesis and virulence in other coronaviruses (B. N. Fields et al.,
Fields virology (Lippincott Williams & Wilkins, Philadelphia, ed. 4, 2001)). In other
coronaviruses, the mature spike protein is inserted in the viral envelope with the
majority of the protein exposed on the surface of the particles. Three molecules of the
Spike protein form the characteristic peplomers or corona-like structures of this virus
family. Analysis of the spike glycoprotein with SignalP (Nielsen, H. et al., Prot
Engineer, 10:1-6 (1997)) indicates a signal peptide (MFIFLLFLTLTSG; SEQ ID NO:
76) (probability 0.996) with cleavage between residues 13 and 14. TMHMM
(Sonnhammer, E. L. et al., Proc Int Conf Intell Syst Mol Biol 6, 175-82 (1998))
indicates a transmembrane domain near the C-terminal end
(WYVWLGFIAGLIATVMVTILLCC; SEQ ID NO: 183). Together these data indicate
a type I membrane protein with the N-terminus and the majority of the protein (residues
14-1195) on the outside of the cell surface or virus particle, which may be responsible
for binding to a cellular receptor. The SARS virus Spike glycoprotein has limited
sequence identity to other known Spike glycoproteins (Figures 14A-F).

ORF 3 (Figures 2 and 12; nucleotides 25253-26074 of the 29,736-base genome
and nucleotides 25,268-26,092 of the 29,751-base genome) encodes a protein of 274
amino acids (Figure 18; SEQ ID NO: 66) that lacks significant similarities to any
known protein when analyzed with BLAST (Altschul, S. F. et al., Nucleic Acids Res 25,
3389-402, Sep 1, 1997), FASTA (Pearson, W. R. and D. J. Lipman, Proc Natl Acad Sci
USA 85, 2444-8, Apr, 1988) or PFAM (Bateman, A. et al., Nucleic Acids Res 30,
276-80, Jan 1, 2002). Analysis of the N-terminal 70 amino acids with SignalP indicates the
existence of a signal peptide (MDLFMRFFTLRSITAQ; SEQ ID NO: 184) and a
cleavage site (probability 0.540). Both TMpred (Hofmann, K. and W. Stoffel, Biol
Chem Hoppe-Seyler 374, 166 (1993)) and TMHMM indicate three trans-membrane
regions spanning approximately residues 34-56 (TIPLQASLPFGWLVIGVAFLAVF,
SEQ ID NO: 77), 77-99 (FQFICNLLLLFVTIYSHLLLVAA, SEQ ID NO: 78), and
103-125 (AQFLYLYALIYFLQCINACRIIM, SEQ ID NO: 79). Both TMpred and
TMHMM indicate that the C-terminus and a large 149 amino acid domain are located
inside the viral or cellular membrane. The C-terminal (interior) region of the protein,
corresponding to about amino acids 124-274
(MRCWLCWKCKSIQ^LLYDAOTFVCWHTHNYDYCff
STPKLKEDYQIGGYSEDRHSGVKDYVVVHGYFrEVYYQLES^^
FFIFNKLVKI)PPNVQmTmGSSGVANPAMDPIYDEPTm; SEQ ID NO:
185), may encode a protein domain with ATP-binding properties (PD037277).
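
As a crude cross-check on the predicted transmembrane regions, the sketch below averages Kyte-Doolittle hydropathy values over the three segments quoted above (SEQ ID NOs: 77-79). This is not how TMpred or TMHMM work; it is a simplified stand-in, and the function and variable names are assumptions.

```python
# Minimal sketch: mean Kyte-Doolittle hydropathy of the quoted ORF 3 segments.
# A simplified stand-in for illustration, not the TMpred or TMHMM algorithms.

KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
      "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
      "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}


def mean_hydropathy(segment):
    """Average Kyte-Doolittle value over a segment; positive is hydrophobic."""
    return sum(KD[aa] for aa in segment.upper()) / len(segment)


SEGMENTS = {
    "34-56": "TIPLQASLPFGWLVIGVAFLAVF",     # SEQ ID NO: 77
    "77-99": "FQFICNLLLLFVTIYSHLLLVAA",     # SEQ ID NO: 78
    "103-125": "AQFLYLYALIYFLQCINACRIIM",   # SEQ ID NO: 79
}

scores = {span: mean_hydropathy(seq) for span, seq in SEGMENTS.items()}
```

All three segments come out clearly hydrophobic on this scale, which is at least compatible with the membrane-spanning assignment, although a positive mean alone does not establish a transmembrane helix.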

ORF 4 (Figure 12; nucleotides 25,689-26,153 of the 29,751-base genome)
encodes a predicted protein of 154 amino acids (Figure 19; SEQ ID NO: 67). This
ORF overlaps entirely with ORF 3 and the E protein. ORF4 may be expressed from the
ORF mRNA using an internal ribosomal entry site. BLAST analyses failed to identify
matching sequences. Analysis with TMPred predicts a single transmembrane helix,
amino acids 1-20 (MMPTTLFAGTHITMTTVYHI; SEQ ID NO: 186).

The small envelope protein E (Figures 2 and 12; nucleotides 26102-26329 of
the 29,736-base genome and nucleotides 26,117 - 26,347, ORF 5, of the 29,751-base
genome) encodes a protein of 76 amino acids (Figure 7; SEQ ID NO: 35). BLAST and
FASTA comparisons indicate that the protein, while novel, is homologous to multiple
envelope proteins (alternatively known as small membrane proteins) from several
coronaviruses. An alignment of the SARS virus E protein with the envelope protein of
Porcine transmissible gastroenteritis coronavirus indicates approximately 28%
sequence identity between the two proteins over a 61 amino acid overlap, as calculated
by FASTA (Figure 15). PFAM analysis of the protein indicates that the small envelope
protein E is a member of the NS3_EnvE protein family. InterProScan (R. Apweiler et
al., Nucleic Acids Res 29, 37-40, Jan 1, 2001; Zdobnov, E. M. and R. Apweiler,
Bioinformatics 17, 847-8, Sep, 2001) analysis indicates that the protein is a component
of the viral envelope, and homologs of it are found in other viruses, including
gastroenteritis virus and murine hepatitis virus. SignalP analysis indicates the presence
of a transmembrane anchor (probability 0.939). TMpred analysis indicates a similar
transmembrane anchor at positions 17-34 (VLLFLAFVVFLLVTLAIL, SEQ ID NO:
80), which is consistent with the known association of homologous proteins with the
viral envelope. TMHMM indicates a type II membrane protein with the majority of the
46 residue C terminus hydrophilic domain
(TALRLCAYCCNIVhTVSLVOTVYWSRVKNLNSSEG; SEQ ID NO: 187) located on the
surface of the viral particle. The E protein may be important for viral replication.
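
The transmembrane predictions described here rest on runs of hydrophobic residues. As a rough illustration only, the following sketch scores a peptide segment with the Kyte-Doolittle hydropathy scale. This is a simpler, swapped-in technique and is not the algorithm used by TMpred or TMHMM. The function names are hypothetical, and the two example segments (the predicted E protein anchor, SEQ ID NO: 80, and a short legible stretch of the C-terminal region) were chosen for illustration.

```python
# Kyte-Doolittle hydropathy values; positive values are hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Average Kyte-Doolittle score of a peptide segment."""
    return sum(KYTE_DOOLITTLE[aa] for aa in segment) / len(segment)


def hydropathy_profile(sequence, window=9):
    """Sliding-window mean hydropathy, one value per window start."""
    return [
        mean_hydropathy(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]


# Illustrative inputs (assumptions of this sketch, taken from the text above).
E_ANCHOR = "VLLFLAFVVFLLVTLAIL"  # SEQ ID NO: 80, positions 17-34
E_TAIL_PART = "SRVKNLNSSEG"      # stretch of the C-terminal hydrophilic region

if __name__ == "__main__":
    print("anchor mean hydropathy:", round(mean_hydropathy(E_ANCHOR), 2))
    print("tail mean hydropathy:", round(mean_hydropathy(E_TAIL_PART), 2))
```

A strongly positive mean over a stretch of roughly 18 or more residues is consistent with, but does not by itself establish, a membrane-spanning segment.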

The Matrix glycoprotein M (Figures 2 and 12; nucleotides 26383-27045 of the
29,736-base genome and nucleotides 26,398 - 27,063, ORF 6, of the 29,751-base genome)
encodes a protein of 221 amino acids (Figure 6; SEQ ID NO: 34). BLAST and FASTA
analyses reveal that the protein, while novel, is homologous to coronaviral matrix
glycoproteins (Figure 9). The association of the spike glycoprotein (S) with the matrix
glycoprotein (M) may be an essential step in the formation of the viral envelope and in
the accumulation of both proteins at the site of virus assembly. Analysis of the amino
acid sequence with SignalP indicates a signal sequence (probability 0.932), located at
approximately residues 1-39
(MADNGTITVEELKQLLEQWNLVIGFLFLAWIMLLQFAYS; SEQ ID NO: 188),
that is unlikely to be cleaved. TMHMM and TMpred analyses both indicate the
presence of three trans-membrane helices, located at approximately residues 15-37
(LLEQWNLVIGFLFLAWIMLLQFA; SEQ ID NO: 81), 50-72
(LWLWLLWPVTLACFVLAAVYRI; SEQ ID NO: 82) and 77-99
(GGIAIAMACIVGLMWLSYFVASF; SEQ ID NO: 83), with the 121 amino acid
hydrophilic domain on the inside of the virus particle, where it may interact with
nucleocapsid. The hydrophilic domain may run from approximately amino acids
PLRGTWTRPLMESELVIGAVIIRGHLRMAGHSLGRCDIKDLPmTVATSRT^
YYKLGASQRVGTDSGFAAYNRYRIGNYKLNTDHAGSNDNIALLVQ (SEQ ID
NO: 189), i.e. approximately amino acids 95 or 99 to 221 of SEQ ID NO: 34. PFAM
analysis reveals a match to PFAM domain PF01635, and alignments to 85 other
sequences in the PFAM database bearing this domain, which is indicative of the
coronavirus matrix glycoprotein.

ORF6 (Figure 2; nucleotides 27059-27247 of the 29,736-base genome
sequence) or ORF 7 (Figure 12; nucleotides 27,074-27,265 of the 29,751-base genome
sequence) encodes a protein of 63 amino acids (Figure 20; SEQ ID NO: 68). TMpred
analysis indicates a trans-membrane helix located between residues 3 or 4 and 22
(HLVDFQVTIAEILIIIMRTF; SEQ ID NO: 84), with the N-terminus located outside
the viral particle.

Similarly, the gene encoding ORF7 (Figure 2; nucleotides 27258-27623 of the
29,736-base genome sequence) or ORF 8 (Figure 12; nucleotides 27,273-27,641 of the
29,751-base genome sequence), encoding a protein of 122 amino acids (Figure 21; SEQ
ID NO: 69), has no significant BLAST or FASTA matches to known proteins.
Analysis of this sequence with SignalP indicates a cleaved signal sequence
(MKHLFLTLIVFTSC; SEQ ID NO: 85) (probability 0.995), with the cleavage site
located between residues 15 and 16. TMpred and TMHMM analyses also indicate a
trans-membrane helix located approximately at residues 99-117
(SPLFLIVAALVFLILCFTI; SEQ ID NO: 86). Together these data indicate that this
protein is a type I membrane protein with the major hydrophilic domain of the protein
(residues 16-98; ELYHYQECVRGTTVLLKEPCP
SGTYEGNSPFHPIADNKFALTCrSTHFAFACADGTRHTYQUlARSVSPKIJ'IRQ
EEVQQELY; SEQ ID NO: 87) and the amino-terminus oriented inside the lumen of
the ER/Golgi, or on the surface of the cell membrane or virus particle, depending on the
membrane localization of the protein.

ORF8 (Figure 2; nucleotides 27623-27754 of the 29,736-base genome
sequence) or ORF9 (Figure 12; nucleotides 27,638-27,772 of the 29,751-base genome
sequence) encodes a protein of 44 amino acids (Figure 22; SEQ ID NO: 70). FASTA
analysis of this sequence revealed some weak similarities (37% identity over a 35
amino acid overlap) to Swiss-Prot accession Q9M883, annotated as a putative sterol-C5
desaturase. A similarly weak match to a hypothetical Clostridium perfringens protein
(Swiss-Prot accession CPE2366) was also detected. TMpred indicated a single strong
trans-membrane helix (FYLCFLAFLLFLVLIMLIIFWFS; SEQ ID NO: 190), with little
preference for alternate models in which the N-terminus was located inside or outside
the particle.

Similarly, ORF9 (Figure 2; nucleotides 27764-27880 of the 29,736-base genome
sequence) or ORF10 (Figure 12; nucleotides 27,779-27,898 of the 29,751-base genome
sequence), encoding a protein of 39 amino acids (Figure 23; SEQ ID NO: 71), exhibited
no significant matches in BLAST and FASTA searches but is predicted by TMpred to
encode a trans-membrane helix LLIVLTCISLCSCICTWQ (SEQ ID NO: 191), with the
N-terminus located within the viral particle. The region immediately upstream of this
protein exhibits a strong match to the TRS consensus (Table 2), indicating that a
transcript initiates from this site. The large number of cysteine residues (6) may result
in cross linking of the amino acids. Amino acids ICTWQRCASNKPHVLEDPCKVQH (SEQ
ID NO: 192) of this protein may be secreted. The secreted amino acids exhibit
homology to toxin proteins, for example, to the conotoxin of Conus ventricosus (Figure
27). Antigenic peptides from the hydrophilic (secreted) region, for example,
CICTWQRCASNKPHVLEDPCK (SEQ ID NO: 193), were used to generate
monoclonal antibodies using standard techniques. Furthermore, the C terminal amino
acids form a sequence that shares homology to farnesylation sites (CKQH), which
generally require C terminal location to be functional. This protein may act as a
virulence factor and/or may facilitate transmission to humans.

ORF10 (Figure 2; nucleotides 27849-28100 of the 29,736-base genome
sequence) or ORF11 (Figure 12; nucleotides 27,864-28,118 of the 29,751-base genome
sequence), encoding a protein of 84 amino acids (Figure 24; SEQ ID NO: 72), exhibited
only very short (9-10 residues) matches to a region of the human coronavirus E2
glycoprotein precursor (starting at residue 801). Analyses by SignalP and TMHMM
predict a soluble protein. A detectable alignment to the TRS consensus sequence was
also found (Table 2).

The protein (422 amino acids; Figure 8; SEQ ID NO: 36) encoded by the
Nucleocapsid gene (Figure 2; nucleotides 28105-29370 of the 29,736-base genome
sequence; Figure 12, nucleotides 28,120-29,388 of the 29,751-base genome sequence)
aligns well with nucleocapsid proteins from other representative coronaviruses (Figures
10A-B), although a short lysine rich region (KTFPPTEPKJKDKJCKKTDEAQ; SEQ ID
NO: 14) is unique to SARS. This region is suggestive of a nuclear localization signal.
Since some coronaviruses are able to replicate in enucleated cells, the SARS virus
nucleocapsid protein may have evolved a novel nuclear function, which may play a role
in pathogenesis. In addition, the basic nature of this peptide suggests it may assist in
RNA binding. The SARS nucleocapsid protein is also a good candidate for diagnostic
tests.

ORF 13 (Fig. 12; nucleotides 28,130 - 28,426 of the 29,751-base genome
sequence) encodes a novel protein of 98 amino acids (Figure 25; SEQ ID NO: 73).
ORF 14 (Fig. 12; nucleotides 28,583 - 28,795 of the 29,751-base genome sequence)
encodes a novel protein of 70 amino acids (Figure 26; SEQ ID NO: 74). TMpred
predicts a single transmembrane helix WAVIQEIQLLAAVGEILLLEW (SEQ ID NO:
194).

Various features of the SARS virus genome are summarised in Table 3. While
Table 3 refers to the 29,751-base genome sequence, the features are also applicable to 
the 29,736-base genome sequence (SEQ ID NOs: 1 and 2). 

Table 3. Features of the SARS virus 29,751-base genome sequence.

Feature       Start - End¹       No. amino acids   No. bases   Frame   TRS
Orf1a         265 - 13,398       4,382             13,149      +1      N/A
Orf1b         13,398 - 21,485    2,628             7,887       +3      N/A
S protein     21,492 - 25,259    1,255             3,768       +3      Strong
Orf3          25,268 - 26,092    274               825         +2      Strong
Orf4          25,689 - 26,153    154               465         +3      Absent
E protein     26,117 - 26,347    76                231         +2      Weak
M protein     26,398 - 27,063    221               666         +1      Strong
Orf7          27,074 - 27,265    63                192         +2      Weak
Orf8          27,273 - 27,641    122               369         +3      Strong
Orf9          27,638 - 27,772    44                135         +2      Weak
Orf10         27,779 - 27,898    39                120         +2      Strong
Orf11         27,864 - 28,118    84                255         +3      Weak
N protein     28,120 - 29,388    422               1,269       +1      Strong
Orf13²        28,130 - 28,426    98                297         +2      Absent
Orf14²        28,583 - 28,795    70                213         +2      Absent
s2m motif     29,590 - 29,621    N/A               30          N/A     N/A

1. End coordinates include the stop codon, except for ORF 1a and s2m.
2. These ORFs overlap substantially or completely with others and may share TRSs.
N/A indicates not applicable.
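
The columns of Table 3 are related by simple arithmetic for the ORFs whose end coordinate includes the stop codon: the span in bases equals three times the number of amino acids plus three. The sketch below checks that relation for those rows. The frame rule used, (start - 1) mod 3 + 1, is an assumption of this sketch inferred from the tabulated values rather than a stated definition, and the function names are hypothetical.

```python
# (feature, start, end, amino acids, bases, frame) as listed in Table 3 for
# the ORFs whose end coordinate includes the stop codon.
FEATURES = [
    ("S protein", 21492, 25259, 1255, 3768, 3),
    ("Orf3", 25268, 26092, 274, 825, 2),
    ("Orf4", 25689, 26153, 154, 465, 3),
    ("E protein", 26117, 26347, 76, 231, 2),
    ("M protein", 26398, 27063, 221, 666, 1),
    ("Orf7", 27074, 27265, 63, 192, 2),
    ("Orf8", 27273, 27641, 122, 369, 3),
    ("Orf9", 27638, 27772, 44, 135, 2),
    ("Orf10", 27779, 27898, 39, 120, 2),
    ("Orf11", 27864, 28118, 84, 255, 3),
    ("N protein", 28120, 29388, 422, 1269, 1),
    ("Orf13", 28130, 28426, 98, 297, 2),
    ("Orf14", 28583, 28795, 70, 213, 2),
]


def span_length(start, end):
    """Number of bases covered by inclusive 1-based coordinates."""
    return end - start + 1


def reading_frame(start):
    """Frame (1, 2 or 3) implied by a 1-based start coordinate (assumed rule)."""
    return (start - 1) % 3 + 1


def row_is_consistent(row):
    """True if span, codon count (plus stop) and frame agree for one row."""
    _, start, end, n_aa, n_bases, frame = row
    return (
        span_length(start, end) == n_bases
        and 3 * (n_aa + 1) == n_bases
        and reading_frame(start) == frame
    )


if __name__ == "__main__":
    for row in FEATURES:
        print(row[0], "consistent" if row_is_consistent(row) else "check")
```

Orf1a, Orf1b and the s2m motif are left out of the list because footnote 1 and the ribosomal frameshift between Orf1a and Orf1b mean the same relation is not expected to hold for them.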

Various polymorphisms may exist in the SARS virus. In the SARS 29,736-base
genome sequences (SEQ ID NO: 1 or 2), for example, nucleotides 7904, 16607, 19168,
24857, or 26842 may be C or T; or nucleotides 19049, 23205, or 25283 may be G or A,
and in the SARS 29,751-base genome sequence (SEQ ID NO: 15), for example,
nucleotides 7919, 16622, 19183, 24872, or 26857 may be C or T; or nucleotides 19064,
23220, or 25298 may be G or A. In some embodiments, the nucleotide changes may
result in no change in the encoded amino acid, or in a conservative or non-conservative
change in the encoded amino acid. In some embodiments, a nucleotide change, as
described herein, at position 7904 or 7919, may result in an A to V amino acid
substitution, in the Replicase 1A protein coding region; a change at position 19168 or
19183 may result in a V to A amino acid substitution, in the Replicase 1B protein
coding region; a change at position 23205 or 23220 may result in an A to S amino acid
substitution (non-conservative change), affecting the Spike glycoprotein coding region;
a change at position 25283 or 25298 may result in an R to G amino acid substitution
(non-conservative change), affecting ORF3; or a change at position 26842 or 26857
may result in an S to P amino acid substitution (non-conservative change), affecting the
Nucleocapsid protein coding region, in the SARS 29,736-base (SEQ ID NO: 1 or 2)
and 29,751-base genome (SEQ ID NO: 15) sequences, respectively. In various
embodiments, a nucleotide or amino acid sequence including a particular
polymorphism may be selected, for example, for use in the methods of the invention, or
may be excluded, for example, from a particular use according to the invention.
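
Each polymorphic position listed for the 29,751-base sequence is 15 greater than the corresponding position in the 29,736-base sequence. The sketch below records the two lists and checks that constant offset. The offset is an observation drawn from the listed positions only; feature end coordinates elsewhere in the text do not all differ by the same amount, and the names used are hypothetical.

```python
# Offset observed between the two numbering schemes for the listed positions.
OFFSET = 15

# Polymorphic positions as listed for each genome sequence.
POLYMORPHISMS_29736 = {
    "C/T": [7904, 16607, 19168, 24857, 26842],
    "G/A": [19049, 23205, 25283],
}
POLYMORPHISMS_29751 = {
    "C/T": [7919, 16622, 19183, 24872, 26857],
    "G/A": [19064, 23220, 25298],
}


def to_29751(position):
    """Map a listed 29,736-base position to 29,751-base numbering."""
    return position + OFFSET


def to_29736(position):
    """Map a listed 29,751-base position to 29,736-base numbering."""
    return position - OFFSET


def offsets_agree():
    """True if every listed position pair differs by exactly OFFSET."""
    return all(
        [to_29751(p) for p in POLYMORPHISMS_29736[alleles]]
        == POLYMORPHISMS_29751[alleles]
        for alleles in POLYMORPHISMS_29736
    )


if __name__ == "__main__":
    print("constant offset holds:", offsets_agree())
```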

Various alternative embodiments of the invention are described below. These
embodiments include, without limitation, identification and use of SARS virus nucleic
acid and amino acid sequences for diagnostic or therapeutic uses.

Diagnosis of SARS virus-related disorders 

A SARS virus-related disorder is any disorder that is mediated by the SARS
virus, or by a nucleic acid molecule or polypeptide derived from the SARS virus.
Accordingly, SARS virus nucleic acid molecules and polypeptides may be used to
diagnose and identify a SARS virus-related disorder in a mammal, for example, a
human or a domestic, farm, wild, or experimental animal. In some embodiments,
SARS virus nucleic acid molecules and polypeptides may be used to screen such
animals, e.g., civet cats, for the presence of SARS virus. A SARS virus-related
disorder may be a hepatic, enteric, respiratory, or neurological disorder, and may be
accompanied by one or more symptoms or indications including, but not limited to,
fever, cough, shortness of breath, headache, low blood oxygen concentration, liver
damage, or reduced lymphocyte numbers. Accordingly, samples for diagnosis may be
obtained from cells, blood, serum, plasma, urine, stool, conjunctiva, sputum,
nasopharyngeal or oropharyngeal swabs, tracheal aspirates, bronchoalveolar lavage,
pleural fluid, amniotic fluid, or any other specimen, or any extract thereof, or by tissue
biopsy of, for example, lungs or major organs, obtained from a patient (human or
animal), test subject, or experimental animal.

A SARS virus-related disorder may be diagnosed by amplifying a SARS
nucleic acid molecule or fragment thereof from a sample. Probes or primers for use in
amplification may be prepared using standard techniques. In some embodiments,
probes or primers are selected from regions of a SARS virus genome as described
herein that show limited sequence homology or identity (e.g., less than 10%, 20%,
30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% identity) to other viruses or
pathogens, or to host sequences.
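
The selection criterion above, limited identity to other sequences, can be illustrated with a minimal sketch that slides a candidate primer along a background sequence and reports the best ungapped percent identity. This is a simplification made for illustration: it ignores gaps and the reverse strand, a real screen would use an alignment tool such as BLAST, and the function names, the 70% default threshold and the toy sequences in the usage note are all hypothetical.

```python
def percent_identity(a, b):
    """Percent of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_ungapped_identity(primer, background):
    """Highest ungapped identity of the primer to any window of background."""
    n = len(primer)
    if len(background) < n:
        return 0.0
    return max(
        percent_identity(primer, background[i:i + n])
        for i in range(len(background) - n + 1)
    )


def is_candidate(primer, backgrounds, threshold=70.0):
    """True if the primer stays below the identity threshold everywhere."""
    return all(
        best_ungapped_identity(primer, bg) < threshold for bg in backgrounds
    )


if __name__ == "__main__":
    print(best_ungapped_identity("ACGT", "TTACGTTT"))
    print(is_candidate("ACGTACGT", ["TTTTTTTTTTTT"]))
```

In use, `backgrounds` would hold sequences from other viruses, pathogens or the host, and the threshold would be set to whichever identity limit is chosen for the embodiment.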

Nucleic acid sequences can be amplified as needed by methods known in the
art. For example, this can be accomplished by polymerase chain reaction ("PCR") of
DNA, or of RNA by reverse transcriptase-PCR ("RT-PCR") (see generally PCR
Technology: Principles and Applications for DNA Amplification (ed. H. A. Erlich,
Freeman Press, NY, N.Y., 1992); PCR Protocols: A Guide to Methods and
Applications (eds. Innis et al., Academic Press, San Diego, Calif., 1990); Mattila et al.,
Nucleic Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods and Applications 1, 17
(1991); PCR (eds. McPherson et al., IRL Press, Oxford); and U.S. Pat. No. 4,683,202
issued July 28, 1987 to Mullis). Variations of standard PCR techniques, such as for
example real time RT-PCR using internal as well as amplification primers, resulting in
increased sensitivity and speed, and reduction of risk of sample contamination (see for
example Higuchi, R., et al., "Kinetic PCR Analysis: Real-time Monitoring of DNA
Amplification Reactions," Bio/Technology, vol. 11, pp. 1026-1030 (1993); Heid et al.,
"Real Time Quantitative PCR", Genome Research, 1996, pp. 986-994; Gibson UE et
al., "A novel method for real time quantitative RT-PCR," Genome Res. 1996
Oct;6(10):995-1001), or the "Taqman" approach to PCR, described by for example
Holland et al., Proc. Natl. Acad. Sci., 88: 7276-7280 (1991), may be performed.

Other suitable amplification and analytical methods include single base
primer extension (see for example U.S. Patent No. 6,004,744), mini-sequencing, ligase
chain reaction (LCR) (see for example Wu and Wallace, Genomics 4, 560 (1989);
Landegren et al., Science 241, 1077 (1988)), transcription amplification (Kwoh et al.,
Proc. Natl. Acad. Sci. USA 86, 1173 (1989)), self-sustained sequence replication
(Guatelli et al., Proc. Natl. Acad. Sci. USA, 87, 1874 (1990)) and nucleic acid based
sequence amplification (NASBA). The latter two amplification methods involve
isothermal reactions based on isothermal transcription, which produce both single
stranded RNA (ssRNA) and double stranded DNA (dsDNA) as the amplification
products in a ratio of about 30 or 100 to 1, respectively.

A SARS virus-related disorder may also be diagnosed using an antibody
directed against a SARS virus nucleic acid or amino acid sequence that specifically
binds a nucleic acid molecule or polypeptide. In an alternative embodiment, the
antibody may be directed against a SARS polypeptide, for example, the S polypeptide
or fragment thereof that is located on the surface of the SARS virion. Methods for
preparation of antibodies or for assaying antibody binding are well known in the art.

Serological diagnosis may include detection of antibodies against a SARS
virus polypeptide or nucleic acid molecule, e.g., the Nucleocapsid protein, produced in
response to infection, using techniques such as indirect fluorescent antibody testing or
enzyme-linked immunosorbent assays (ELISA). A SARS virus-related disorder may
also be diagnosed by, for example, performing in situ probe hybridization studies on
tissue specimens.

In some aspects, diagnostic tests as described herein or known to those of skill
in the art may be performed for SARS virus variants that exhibit increased
pathogenicity, such as strains having redundant sequences.

In some embodiments, reagents for diagnosis (e.g., probes, primers, antibodies,
etc.) may be provided in kits which may optionally include instructions for using the
reagent or may include other reagents for performing the appropriate assay, e.g.,
controls, standards, buffers, etc.

Therapy or Prophylaxis for SARS virus-related disorders

Compounds according to the invention may also be used to provide therapeutics
or prophylactics for SARS virus-related disorders. Accordingly, such compounds may
be used to treat a mammal, for example, a human or a domestic, farm, wild, or
experimental animal that has or is at risk for a SARS virus-related disorder. Such
compounds may include, without limitation, compounds that interfere with SARS virus
replication, expression of SARS virus proteins, or the ability of the SARS virus to
infect a host cell. Accordingly, in some embodiments, compounds that act as
antagonists to SARS virus polypeptides may be used as therapeutics or prophylactics
for SARS virus-related disorders. In some embodiments, purified SARS virus
polypeptides may be used as, for example, competitive inhibitors to disrupt viral
function. For example, a Spike protein lacking a functional domain, or having some
other modification that maintains binding but reduces or eliminates pathogenicity, may
be used to disrupt viral function. In some embodiments, antibodies that bind SARS
virus polypeptides or nucleic acid molecules, for example, humanized antibodies, may
be used as therapeutics or prophylactics.

In some embodiments, the SARS virus compounds may be used as vaccines, or
may be used to develop vaccines. For example, peptides derived from portions of
SARS virus proteins or polypeptides located on the outside of the virion or cell surface
may be useful for vaccines or for generation of therapeutic or prophylactic antibodies.

A "vaccine" is a composition that includes materials that elicit a desired
immune response. A vaccine may select, activate or expand memory B and T cells of
the immune system to, for example, enable the elimination of infectious agents, such as
a SARS virus, or a component thereof. In some embodiments, a vaccine includes a
suitable carrier, such as an adjuvant, which is an agent that acts in a non-specific
manner to increase the immune response to a specific antigen, or to a group of antigens,
enabling the reduction of the quantity of antigen in any given vaccine dose, or the
reduction of the frequency of dosage required to generate the desired immune response.

Vaccines according to the invention may include SARS virus polypeptides and
nucleic acid molecules described herein, or immunogenic fragments thereof. In some
embodiments, a SARS virus Spike polypeptide, Envelope polypeptide, or membrane
glycoprotein or fragments thereof may be suitable for vaccine applications. In some
embodiments, the vaccines may be multivalent and include one or more epitopes from a
SARS virus polypeptide or fragment thereof.

In some embodiments of the invention, a vaccine may include a live or killed
microorganism, e.g., a SARS virus or a component thereof. If a live SARS virus is
used, which may be administered in the form of an oral vaccine, it may contain
non-revertible genetic alterations (for example, large deletions or insertions in the
genomic sequence) that reduce or eliminate the virulence of the virus ("attenuated
virus"), but not its induction of an immune response. In some embodiments, a live
vaccine may include an attenuated non-SARS microorganism (e.g., bacteria or virus
such as vaccinia virus) that is capable of expressing a SARS virus polypeptide or
immunogenic fragment thereof as described herein. In some embodiments, a vaccine
may include SARS virus polypeptides or nucleic acid molecules having modifications
that facilitate ease of administration. For example, an indigestible SARS virus
polypeptide or nucleic acid molecule may be used for oral administration, and a
modification that is suitable for inhalation may be used for administration to the lung.

A "nucleic acid vaccine" or "DNA vaccine" as used herein, is a nucleic acid
construct comprising a polynucleotide encoding a polypeptide antigen, particularly an
antigenic amino acid subsequence identified by methods described herein or known in
the art. The nucleic acid construct can also include transcriptional promoter elements,
enhancer elements, splicing signals, termination and polyadenylation signals, and other
nucleic acid sequences. Thus, a nucleic acid vaccine is generally introduced into a
subject animal using, for example, one or more DNA plasmids including one or more
antigen-coding sequences (for example, a SARS virus Envelope polypeptide or
membrane glycoprotein sequence) that are capable of transfecting cells in vivo and
inducing an immune response (see for example Whalen RG et al. DNA-mediated
immunization and the energetic immune response to hepatitis B surface antigen. Clin
Immunol Immunopathol 1995;75:1-12; Wolff JA et al. Direct gene transfer into mouse
muscle in vivo. Science 1990;247:1465-8; Fynan EF et al. DNA vaccines: protective
immunizations by parenteral, mucosal, and gene-gun inoculations. Proc Natl Acad Sci
USA 1993;90:11478-82). In some embodiments, a library of nucleic acid fragments
may be prepared by cloning SARS virus genomic DNA into a plasmid expression
vector using known techniques and the library then used as a nucleic acid vaccine (see
for example Barry MA, et al. Protection against mycoplasma infection using
expression-library immunization. Nature 1995;377:632-5).

The subject is administered the nucleic acid vaccine using standard methods.
The vaccine can be administered to the vertebrate parenterally, subcutaneously,
intravenously, intraperitoneally, intradermally, intramuscularly, topically, orally,
rectally, nasally, buccally, vaginally, by inhalation spray, or via an implanted reservoir
in dosage formulations containing conventional non-toxic, physiologically acceptable
carriers or vehicles. Alternatively, the subject is administered the nucleic acid vaccine
through the use of a particle acceleration or bombardment instrument (a "gene gun").
The form in which it is administered (e.g., capsule, tablet, solution, emulsion) will
depend in part on the route by which it is administered. For example, for mucosal
administration, nose drops, inhalants or suppositories can be used. The nucleic acid
vaccine can be administered in conjunction with known adjuvants. The adjuvant is
administered in a sufficient amount, which is that amount that is sufficient to generate
an enhanced immune response to the nucleic acid vaccine. The adjuvant can be
administered prior to (e.g., 1 or more days before) inoculation with the nucleic acid
vaccine; concurrently with (e.g., within 24 hours of) inoculation with the nucleic acid
vaccine; contemporaneously (simultaneously) with the nucleic acid vaccine (e.g., the
adjuvant is mixed with the nucleic acid vaccine, and the mixture is administered to the
vertebrate); or after (e.g., 1 or more days after) inoculation with the nucleic acid
vaccine. The adjuvant can also be administered at more than one time (e.g., prior to
inoculation with the nucleic acid vaccine and also after inoculation with the nucleic
acid vaccine). As used herein, the term "in conjunction with" encompasses any time
period, including those specifically described herein and combinations of the time
periods specifically described herein, during which the adjuvant can be administered so
as to generate an enhanced immune response to the nucleic acid vaccine (e.g., an
increased antibody titer to the antigen encoded by the nucleic acid vaccine, or an
increased antibody titer to the pathogenic agent). The adjuvant and the nucleic acid
vaccine can be administered at approximately the same location on the vertebrate; for
example, both the adjuvant and the nucleic acid vaccine are administered at a marked
site on a limb of the subject.

In some embodiments, expression of a SARS virus gene or coding or non-coding
region of interest may be inhibited or prevented using RNA interference (RNAi)
technology, a type of post-transcriptional gene silencing. RNAi may be used to create a
functional "knockout", i.e. a system in which the expression of a gene or coding or
non-coding region of interest is reduced, resulting in an overall reduction of the encoded
product. As such, RNAi may be performed to target a nucleic acid of interest or
fragment or variant thereof, to in turn reduce its expression and the level of activity of
the product which it encodes. Such a system may be used for therapy or prophylaxis,
as well as for functional studies. RNAi is described in, for example, published US patent
applications 20020173478 (Gewirtz; published November 21, 2002) and 20020132788
(Lewis et al.; published November 7, 2002). Reagents and kits for performing RNAi
are available commercially from, for example, Ambion Inc. (Austin, TX, USA) and New
England Biolabs Inc. (Beverly, MA, USA).

The initial agent for RNAi in some systems is thought to be a dsRNA molecule
corresponding to a target nucleic acid. The dsRNA is then thought to be cleaved into
short interfering RNAs (siRNAs) which are 21-23 nucleotides in length (19-21 bp
duplexes, each with 2 nucleotide 3' overhangs). The enzyme thought to effect this first
cleavage step has been referred to as "Dicer" and is categorized as a member of the
RNase III family of dsRNA-specific ribonucleases. Alternatively, RNAi may be
effected via directly introducing into the cell, or generating within the cell by
introducing into the cell a suitable precursor (e.g. vector, etc.) of such an siRNA or
siRNA-like molecule. An siRNA may then associate with other intracellular
components to form an RNA-induced silencing complex (RISC). The RISC thus
formed may subsequently target a transcript of interest via base-pairing interactions
between its siRNA component and the target transcript by virtue of homology, resulting
in the cleavage of the target transcript approximately 12 nucleotides from the 3' end of
the siRNA. Thus the target mRNA is cleaved and the level of protein product it
encodes is reduced.

RNAi may be effected by the introduction of suitable in vitro synthesized
siRNA or siRNA-like molecules into cells. RNAi may for example be performed using
chemically-synthesized RNA, for which suitable RNA molecules may be chemically
synthesized using known methods. Alternatively, suitable expression vectors may be
used to transcribe such RNA either in vitro or in vivo. In vitro transcription of sense
and antisense strands (encoded by sequences present on the same vector or on separate
vectors) may be effected using for example T7 RNA polymerase, in which case the
vector may comprise a suitable coding sequence operably-linked to a T7 promoter. The
in vitro-transcribed RNA may in embodiments be processed (e.g. using E. coli RNase
III) in vitro to a size conducive to RNAi. The sense and antisense transcripts are combined
to form an RNA duplex which is introduced into a target cell of interest. Other vectors
may be used, which express small hairpin RNAs (shRNAs) which can be processed
into siRNA-like molecules. Various vector-based methods are known in the art.
Various methods for introducing such vectors into cells, either in vitro or in vivo (e.g.
gene therapy) are known in the art.

Accordingly, in an embodiment, expression of a polypeptide including an amino
acid sequence substantially identical to a SARS virus sequence may be inhibited by
introducing into or generating within a cell an siRNA or siRNA-like molecule
corresponding to a nucleic acid molecule encoding the polypeptide or fragment thereof,
or to a nucleic acid homologous thereto. In various embodiments such a method may
entail the direct administration of the siRNA or siRNA-like molecule into a cell, or use
of the vector-based methods described above. In an embodiment, the siRNA or
siRNA-like molecule is less than about 30 nucleotides in length. In a further
embodiment, the siRNA or siRNA-like molecules are about 21-23 nucleotides in
length. In an embodiment, siRNA or siRNA-like molecules comprise a 19-21 bp
duplex portion, each strand having a 2 nucleotide 3' overhang. In embodiments, the
siRNA or siRNA-like molecule is substantially identical to a nucleic acid encoding the
polypeptide or a fragment or variant (or a fragment of a variant) thereof. Such a variant
is capable of encoding a protein having the activity of a SARS virus polypeptide. In
embodiments, the sense strand of the siRNA or siRNA-like molecule is substantially
identical to a SARS virus nucleic acid molecule or a fragment thereof (RNA having U
in place of T residues of the DNA sequence).
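
The duplex geometry described above (a 19 bp paired region with a 2 nucleotide 3' overhang on each strand, and U in place of T) can be illustrated with a minimal sketch. It derives one 21-nucleotide sense strand and its 21-nucleotide antisense partner from a 23-nucleotide window of a target DNA sequence. The function names and the example target are hypothetical; the target is an arbitrary sequence and not a SARS virus sequence, and no selection rules for effective siRNAs are applied.

```python
# Watson-Crick complement table for RNA.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def to_rna(dna):
    """Write a DNA coding-strand sequence as RNA (U in place of T)."""
    return dna.upper().replace("T", "U")


def reverse_complement(rna):
    """Reverse complement of an RNA sequence."""
    return rna.translate(RNA_COMPLEMENT)[::-1]


def design_sirna(target_dna, start, duplex_len=19, overhang=2):
    """Return (sense, antisense) strands, both written 5' to 3'.

    The window covers duplex_len + 2 * overhang bases of the target. The
    sense strand drops the first `overhang` bases; the antisense strand is
    the reverse complement of the window minus its last `overhang` bases.
    """
    size = duplex_len + 2 * overhang
    window = to_rna(target_dna[start:start + size])
    if len(window) != size:
        raise ValueError("target too short for the requested window")
    sense = window[overhang:]
    antisense = reverse_complement(window[:duplex_len + overhang])
    return sense, antisense


# Hypothetical 23-base target window, for illustration only.
EXAMPLE_TARGET = "AAGCTGACCTGAAGTTCATCTGC"

if __name__ == "__main__":
    sense, antisense = design_sirna(EXAMPLE_TARGET, 0)
    print("sense     5'-" + sense + "-3'")
    print("antisense 5'-" + antisense + "-3'")
```

With the default parameters each strand is 21 nucleotides long; the first 19 nucleotides of the two strands pair with each other, and the last 2 nucleotides of each strand form the 3' overhang.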



SARS Virus Protein Expression 

In general, SARS virus polypeptides according to the invention may be
produced by transformation of a suitable host cell with all or part of a SARS virus
polypeptide-encoding genomic or cDNA molecule or fragment thereof (e.g., the
genomic DNA or cDNAs described herein) in a suitable expression vehicle. Those
skilled in the field of molecular biology will understand that any of a wide variety of
expression systems may be used to provide the recombinant protein. The precise host
cell used is not critical to the invention. The SARS virus polypeptide may be produced
in a prokaryotic host (e.g., E. coli or a virus, for example, a coronavirus such as human
OC43 or 229E, a bovine coronavirus, or a virus used for gene therapy, such as an
adenovirus) or in a eukaryotic host (e.g., Saccharomyces cerevisiae, insect cells, e.g.,
Sf21 cells, or mammalian cells, e.g., COS 1, NIH 3T3, VeroE6, or HeLa cells). Such
cells are available from a wide range of sources (e.g., the American Type Culture
Collection, Rockland, Md.; also, see, e.g., Ausubel et al., Current Protocols in
Molecular Biology, John Wiley & Sons, New York, 1994). The method of
transformation or transfection and the choice of expression vehicle will depend on the
host system selected. Transformation and transfection methods are described, e.g., in
Ausubel et al. (supra); expression vehicles may be chosen from those provided, e.g., in
Cloning Vectors: A Laboratory Manual (P. H. Pouwels et al., 1985, Supp. 1987), or
from commercially available sources. Suitable animal models, e.g. a ferret animal
model, or any other animal model suitable for analysis of SARS virus infection or
expression of SARS virus nucleic acid molecules may be used.

In an alternative embodiment, the baculovirus expression system (using, for
example, the vector pBacPAK9) available from Clontech (Palo Alto, Calif.) may be
used. If desired, this system may be used in conjunction with other protein expression
techniques, for example, the myc tag approach described by Evan et al. (Mol. Cell Biol.
5:3610-3616, 1985). In an alternative embodiment, a SARS virus polypeptide may be
produced by a stably-transfected mammalian cell line. A number of vectors suitable for
stable transfection of mammalian cells are available to the public, e.g., see Pouwels et
al. (supra); methods for constructing such cell lines are also publicly available, e.g., in
Ausubel et al. (supra). In one example, cDNA encoding the SARS virus polypeptide is
cloned into an expression vector which includes the dihydrofolate reductase (DHFR)
gene. Integration of the plasmid and, therefore, the SARS virus polypeptide-encoding
gene into the host cell chromosome is selected for by inclusion of 0.01-300 μM
methotrexate in the cell culture medium (as described in Ausubel et al., supra). This
dominant selection can be accomplished in most cell types. Recombinant protein
expression can be increased by DHFR-mediated amplification of the transfected gene.
Methods for selecting cell lines bearing gene amplifications are described in Ausubel et
al. (supra); such methods generally involve extended culture in medium containing
gradually increasing levels of methotrexate. DHFR-containing expression vectors
commonly used for this purpose include pCVSEII-DHFR and pAdD26SV(A)
(described in Ausubel et al., supra). Any of the host cells described above or,
preferably, a DHFR-deficient CHO cell line (e.g., CHO DHFR⁻ cells, ATCC
Accession No. CRL 9096) are among the host cells preferred for DHFR selection of a
stably-transfected cell line or DHFR-mediated gene amplification.

Once the recombinant SARS virus polypeptide is expressed, it is isolated, e.g.,
using affinity chromatography. In one example, an anti-SARS virus polypeptide
antibody (e.g., produced as described herein) may be attached to a column and used to
isolate the SARS virus polypeptide. Lysis and fractionation of SARS virus
polypeptide-harboring cells prior to affinity chromatography may be performed by standard
methods (see, e.g., Ausubel et al., supra). In another example, SARS virus polypeptides
may be purified or substantially purified from a mixture of compounds such as an
extract or supernatant obtained from cells (Ausubel et al., supra). Standard purification
techniques can be used to progressively eliminate undesirable compounds from the
mixture until a single compound or minimal number of effective compounds has been
isolated.

Once isolated, the recombinant protein can, if desired, be further purified, e.g.,
by high performance liquid chromatography (see, e.g., Fisher, Laboratory Techniques
In Biochemistry And Molecular Biology, eds., Work and Burdon, Elsevier, 1980).

Polypeptides of the invention, particularly short SARS virus peptide fragments,
can also be produced by chemical synthesis (e.g., by the methods described in Solid
Phase Peptide Synthesis, 2nd ed., 1984, The Pierce Chemical Co., Rockford, Ill.).

These general techniques of polypeptide expression and purification can also be
used to produce and isolate useful SARS virus protein fragments or analogs (described
herein).

In certain alternative embodiments, the SARS polypeptide might have attached
any one of a variety of tags. Tags can be amino acid tags or chemical tags and can be
added for the purpose of purification (for example a 6-histidine tag for purification over
a nickel column). In other preferred embodiments, various labels can be used as means
for detecting binding of a SARS polypeptide to another polypeptide, for example to a
cell surface receptor. Alternatively, SARS DNA or RNA may be labeled for detection,
for example in a hybridization assay. SARS virus nucleic acids or proteins, or
derivatives thereof, may be directly or indirectly labeled, for example, with a
radioisotope, a fluorescent compound, a bioluminescent compound, a chemiluminescent
compound, a metal chelator or an enzyme. Those of ordinary skill in the art will know
of other suitable labels or will be able to ascertain such, using routine experimentation.
In yet another embodiment of the invention, the polypeptides disclosed herein, or
derivatives thereof, are linked to toxins.


Isolation and Identification of Additional SARS virus molecules 

Based on the SARS virus sequences described herein, the isolation and
identification of additional SARS virus-related sequences such as SARS virus genes
and of additional SARS virus strains or isolates is made possible using standard
techniques. In addition, the SARS virus sequences provided herein also provide the
basis for identification of homologous sequences from other species and genera from
both prokaryotes and eukaryotes such as viruses, bacteria, fungi, parasites, yeast, and/or
mammals. In some embodiments, the nucleic acid sequences described herein may be
used to design probes or primers, including degenerate oligonucleotide probes or
primers, based upon the sequence of either DNA strand. The probes or primers may
then be used to screen genomic or cDNA libraries for sequences from, for example,
naturally occurring variants or isolates of SARS viruses, using standard amplification
or hybridization techniques.
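
As a non-limiting illustration of working with degenerate oligonucleotide primers based on either DNA strand, the following sketch expands a degenerate primer written in the standard IUPAC nucleotide ambiguity codes into the individual sequences it represents, counts its degeneracy, and derives the primer for the opposite strand. The function names and the example primer are assumptions for this sketch; melting temperature, primer length, and other design criteria are not modeled.

```python
# Illustrative sketch only: utilities for degenerate primers written with
# IUPAC nucleotide ambiguity codes.
from itertools import product
from math import prod

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each code, e.g. R (A or G) pairs with Y (T or C).
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "D": "H",
    "H": "D", "V": "B", "N": "N",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def expand_degenerate(primer):
    """List every non-degenerate sequence represented by the primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(bases) for bases in product(*choices)]


def reverse_complement(primer):
    """Degenerate primer for the opposite DNA strand, written 5' to 3'."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(primer.upper()))


if __name__ == "__main__":
    primer = "ATGRCN"  # hypothetical degenerate primer
    print(degeneracy(primer), expand_degenerate(primer))
    print(reverse_complement(primer))
```

Keeping the degeneracy low limits the number of distinct oligonucleotides in the primer pool, which is one reason the count is computed separately from the expansion.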

In some embodiments, binding partners may be identified by tagging the
polypeptides of the invention (e.g., those substantially identical to SARS virus
polypeptides described herein) with an epitope sequence (e.g., FLAG or 2HA), and
delivering the tagged polypeptide into host cells by transfection with a suitable vector
containing a nucleic acid sequence encoding a polypeptide of the invention, followed by
immunoprecipitation and identification of the binding partner. Cells may be infected
with strains expressing the FLAG or 2HA fusions, followed by lysis and
immunoprecipitation with anti-FLAG or anti-2HA antibodies. Binding partners may be
identified by mass spectroscopy. If the polypeptide of the invention is not produced in
sufficient quantities, such a method may not deliver enough tagged protein to identify
its partner. As part of a complementary approach, each polypeptide of the invention
may be cloned into a mammalian transfection vector fused to, for example, 2HA, GFP
and/or FLAG. Following transfection, HeLa cells may be lysed and the tagged
polypeptide immunoprecipitated. The binding partner may be identified by SDS PAGE
followed by mass spectroscopy.

In some embodiments, polypeptides or antibodies of the invention may be
tagged, produced, and used, for example, on affinity columns and/or in immunological
assays to identify and/or confirm identified target compounds. FLAG, HA, and/or His
tagged proteins can be used for such affinity columns to pull out host cell factors from
cell extracts, and any hits may be validated by standard binding assays, saturation
curves, and other methods as described herein or known to those of skill in the art.

In some embodiments, a two hybrid system may be used to study protein-protein
interactions. The nucleic acid sequences described herein, or sequences
substantially identical thereto, can be cloned into the pBT bait plasmid of the two
hybrid system, and a commercially available murine spleen library of 5 x 10^
independent clones may be used as the target library for the baits. Potential hits may
be further characterized by recovering the plasmids and retransforming to reduce false
positives resulting from clonal bait variants and library target clones which activate the
reporter genes independent of the cloned bait. Reproducible hits may be studied further
as described herein.

Virulence may be assayed as described herein or as known to those of skill in the art.
Once coding sequences have been identified, they may be isolated using standard
cloning techniques, and inserted into any suitable vector or replicon for, for example,
production of polypeptides. Such vectors and replicons include, without limitation,
bacteriophage λ (E. coli), pBR322 (E. coli), pACYC177 (E. coli), pKT230
(gram-negative bacteria), pGV1106 (gram-negative bacteria), pLAFR1 (gram-negative
bacteria), pME290 (non-E. coli gram-negative bacteria), pHV14 (E. coli and Bacillus
subtilis), pBD9 (Bacillus), pIJ61 (Streptomyces), pUC6 (Streptomyces), YIp5
(Saccharomyces), YCp19 (Saccharomyces) or bovine papilloma virus (mammalian
cells). In general, the polypeptides of the invention may be produced in any suitable
host cell transformed or transfected with a suitable vector. The method of
transformation or transfection and the choice of expression vehicle will depend on the
host system selected. A wide variety of expression systems may be used, and the
precise host cell used is not critical to the invention. For example, a polypeptide
according to the invention may be produced in a prokaryotic host (e.g., E. coli) or in a
eukaryotic host (e.g., Saccharomyces cerevisiae, insect cells, e.g., Sf21 cells, or
mammalian cells, e.g., NIH 3T3, HeLa, or COS cells). Such cells are available from a
wide range of sources (e.g., the American Type Culture Collection, Manassas, VA).
Bacterial expression systems for polypeptide production include the E. coli pET
expression system (Novagen, Inc., Madison, Wis.), and the pGEX expression
system (Pharmacia).

Compounds 

In one aspect, compounds according to the invention include SARS virus
nucleic acid molecules and polypeptides, such as the sequences disclosed in the Figures
and Tables herein, and throughout the specification, and fragments thereof. In
alternative embodiments, compounds according to the invention may be nucleic acid
molecules that are at least 10 nucleotides in length, and that are derived from the
sequences described herein. In alternative embodiments, compounds according to the
invention may be peptides that are at least 5 amino acids in length, and that are derived
from the sequences described herein.

In alternative embodiments, a compound according to the invention can be a
non-peptide molecule as well as a peptide or peptide analogue. A peptide or peptide
analogue will generally be as small as feasible while retaining full biological activity.
A non-peptide molecule can be any molecule that exhibits biological activity as
described herein or known in the art. Biological activity can, for example, be measured
in terms of ability to elicit a cytotoxic response, to mediate DNA replication, or any
other function of a SARS virus molecule.

Compounds can be prepared by, for example, replacing, deleting, or inserting an
amino acid residue of a SARS peptide or peptide analogue, as described herein, with
other conservative amino acid residues, i.e., residues having similar physical,
biological, or chemical properties, and screening for biological function.

It is well known in the art that some modifications and changes can be made in
the structure of a polypeptide without substantially altering the biological function of
that peptide, to obtain a biologically equivalent polypeptide. Such modifications may
be made for the purpose of modifying function, or for facilitating administration or
enhancing stability or inhibiting breakdown for, for example, therapeutic uses. For
example, an indigestible SARS virus compound according to the invention may be used
for oral administration; a modification that is suitable for inhalation may be used for
administration to the lung; or addition of a leader sequence may increase protein
expression levels.

In one aspect of the invention, SARS virus-derived peptides or epitopes may
include peptides that differ from a portion of a native leader, protein or SARS virus
sequence by conservative amino acid substitutions. The peptides and epitopes of the
present invention also extend to biologically equivalent peptides that differ from a
portion of the sequence of novel peptides of the present invention by conservative
amino acid substitutions. As used herein, the term "conserved amino acid
substitutions" refers to the substitution of one amino acid for another at a given location
in the peptide, where the substitution can be made without substantial loss of the
relevant function. In making such changes, substitutions of like amino acid residues can
be made on the basis of relative similarity of side-chain substituents, for example, their
size, charge, hydrophobicity, hydrophilicity, and the like, and such substitutions may be
assayed for their effect on the function of the peptide by routine testing.

In some embodiments, conserved amino acid substitutions may be made where
an amino acid residue is substituted for another having a similar hydrophilicity value
(e.g., within a value of plus or minus 2.0), where the following hydrophilicity values
are assigned to amino acid residues (as detailed in United States Patent No. 4,554,101,
incorporated herein by reference): Arg (+3.0); Lys (+3.0); Asp (+3.0); Glu (+3.0);
Ser (+0.3); Asn (+0.2); Gln (+0.2); Gly (0); Pro (-0.5); Thr (-0.4); Ala (-0.5);
His (-0.5); Cys (-1.0); Met (-1.3); Val (-1.5); Leu (-1.8); Ile (-1.8); Tyr (-2.3);
Phe (-2.5); and Trp (-3.4).
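
For illustration, the hydrophilicity criterion above can be expressed as a simple comparison of the listed values. In the sketch below the table is transcribed from the values given in the preceding paragraph; the function name and the treatment of the plus or minus 2.0 window as inclusive are assumptions of this sketch. Passing the check does not establish that function is retained, which is determined by routine testing.

```python
# Illustrative sketch only: hydrophilicity values as listed in the text
# (three-letter residue codes).
HYDROPHILICITY = {
    "Arg": 3.0, "Lys": 3.0, "Asp": 3.0, "Glu": 3.0,
    "Ser": 0.3, "Asn": 0.2, "Gln": 0.2, "Gly": 0.0,
    "Pro": -0.5, "Thr": -0.4, "Ala": -0.5, "His": -0.5,
    "Cys": -1.0, "Met": -1.3, "Val": -1.5, "Leu": -1.8,
    "Ile": -1.8, "Tyr": -2.3, "Phe": -2.5, "Trp": -3.4,
}


def is_conserved_by_hydrophilicity(original, replacement, tolerance=2.0):
    """True if the two residues' hydrophilicity values differ by at most
    `tolerance` (assumed inclusive)."""
    difference = HYDROPHILICITY[original] - HYDROPHILICITY[replacement]
    return abs(difference) <= tolerance


if __name__ == "__main__":
    for pair in [("Leu", "Ile"), ("Ser", "Val"), ("Arg", "Trp")]:
        print(pair, is_conserved_by_hydrophilicity(*pair))
```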

In alternative embodiments, conserved amino acid substitutions may be made
where an amino acid residue is substituted for another having a similar hydropathic
index (e.g., within a value of plus or minus 2.0). In such embodiments, each amino acid
residue may be assigned a hydropathic index on the basis of its hydrophobicity and
charge characteristics, as follows: Ile (+4.5); Val (+4.2); Leu (+3.8); Phe (+2.8); Cys
(+2.5); Met (+1.9); Ala (+1.8); Gly (-0.4); Thr (-0.7); Ser (-0.8); Trp (-0.9); Tyr (-1.3);
Pro (-1.6); His (-3.2); Glu (-3.5); Gln (-3.5); Asp (-3.5); Asn (-3.5); Lys (-3.9); and Arg
(-4.5).
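
As an illustrative sketch of the hydropathic index criterion, the following lists, for a given residue, every other residue whose index lies within plus or minus 2.0 of it. The index values are transcribed from the paragraph above; the function name and the inclusive window are assumptions of this sketch.

```python
# Illustrative sketch only: hydropathic index values as listed in the text
# (three-letter residue codes).
HYDROPATHIC_INDEX = {
    "Ile": 4.5, "Val": 4.2, "Leu": 3.8, "Phe": 2.8, "Cys": 2.5,
    "Met": 1.9, "Ala": 1.8, "Gly": -0.4, "Thr": -0.7, "Ser": -0.8,
    "Trp": -0.9, "Tyr": -1.3, "Pro": -1.6, "His": -3.2, "Glu": -3.5,
    "Gln": -3.5, "Asp": -3.5, "Asn": -3.5, "Lys": -3.9, "Arg": -4.5,
}


def conserved_partners(residue, tolerance=2.0):
    """Alphabetical list of residues whose hydropathic index is within
    `tolerance` (assumed inclusive) of that of `residue`."""
    reference = HYDROPATHIC_INDEX[residue]
    return sorted(
        other
        for other, value in HYDROPATHIC_INDEX.items()
        if other != residue and abs(value - reference) <= tolerance
    )


if __name__ == "__main__":
    for residue in ("Ile", "Arg", "Gly"):
        print(residue, conserved_partners(residue))
```

Listing candidate partners per residue, rather than testing one pair at a time, is simply a convenience when proposing substitutions at a chosen position.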

In alternative embodiments, conserved amino acid substitutions may be made
where an amino acid residue is substituted for another in the same class, where the
amino acids are divided into non-polar, acidic, basic and neutral classes, as follows:
non-polar: Ala, Val, Leu, Ile, Phe, Trp, Pro, Met; acidic: Asp, Glu; basic: Lys, Arg,
His; neutral: Gly, Ser, Thr, Cys, Asn, Gln, Tyr.
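
The class-based criterion may be illustrated by comparing a variant peptide with its parent, position by position, and reporting the positions at which the substituted residue falls outside the class of the original. The class membership below is transcribed from the preceding paragraph; the function names and the example peptides are hypothetical.

```python
# Illustrative sketch only: the four classes as listed in the text
# (three-letter residue codes).
CLASSES = {
    "non-polar": {"Ala", "Val", "Leu", "Ile", "Phe", "Trp", "Pro", "Met"},
    "acidic": {"Asp", "Glu"},
    "basic": {"Lys", "Arg", "His"},
    "neutral": {"Gly", "Ser", "Thr", "Cys", "Asn", "Gln", "Tyr"},
}

# Reverse lookup: residue -> class name.
RESIDUE_CLASS = {
    residue: name for name, members in CLASSES.items() for residue in members
}


def same_class(first, second):
    """True if both residues belong to the same class."""
    return RESIDUE_CLASS[first] == RESIDUE_CLASS[second]


def non_conserved_positions(parent, variant):
    """Zero-based positions where the variant differs from the parent and
    the replacement residue is in a different class."""
    if len(parent) != len(variant):
        raise ValueError("peptides must have the same length")
    return [
        index
        for index, (old, new) in enumerate(zip(parent, variant))
        if old != new and not same_class(old, new)
    ]


if __name__ == "__main__":
    parent = ["Ala", "Lys", "Ser", "Asp"]   # hypothetical peptide
    variant = ["Val", "Glu", "Thr", "Asp"]  # hypothetical variant
    print(non_conserved_positions(parent, variant))
```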

Conservative amino acid changes can include the substitution of an L-amino
acid by the corresponding D-amino acid, by a conservative D-amino acid, or by a
naturally-occurring, non-genetically encoded form of amino acid, as well as a
conservative substitution of an L-amino acid. Naturally-occurring non-genetically
encoded amino acids include beta-alanine, 3-amino-propionic acid, 2,3-diamino
propionic acid, alpha-aminoisobutyric acid, 4-amino-butyric acid, N-methylglycine
(sarcosine), hydroxyproline, ornithine, citrulline, t-butylalanine, t-butylglycine,
N-methylisoleucine, phenylglycine, cyclohexylalanine, norleucine, norvaline,
2-naphthylalanine, pyridylalanine, 3-benzothienyl alanine, 4-chlorophenylalanine,
2-fluorophenylalanine, 3-fluorophenylalanine, 4-fluorophenylalanine, penicillamine,
1,2,3,4-tetrahydro-isoquinoline-3-carboxylic acid, beta-2-thienylalanine, methionine
sulfoxide, homoarginine, N-acetyl lysine, 2-amino butyric acid,
2,4-diamino butyric acid, p-aminophenylalanine, N-methylvaline, homocysteine,
homoserine, cysteic acid, epsilon-amino hexanoic acid, delta-amino valeric acid, or
2,3-diaminobutyric acid.

In alternative embodiments, conservative amino acid changes include changes
based on considerations of hydrophilicity or hydrophobicity, size or volume, or charge.
Amino acids can be generally characterized as hydrophobic or hydrophilic, depending
primarily on the properties of the amino acid side chain. A hydrophobic amino acid
exhibits a hydrophobicity of greater than zero, and a hydrophilic amino acid exhibits a
hydrophobicity of less than zero, based on the normalized consensus hydrophobicity
scale of Eisenberg et al. (J. Mol. Biol. 179:125-142, 1984). Genetically encoded
hydrophobic amino acids include Gly, Ala, Phe, Val, Leu, Ile, Pro, Met and Trp, and
genetically encoded hydrophilic amino acids include Thr, His, Glu, Gln, Asp, Arg, Ser,
and Lys. Non-genetically encoded hydrophobic amino acids include t-butylalanine,
while non-genetically encoded hydrophilic amino acids include citrulline and
homocysteine.

Hydrophobic or hydrophilic amino acids can be further subdivided based on the
characteristics of their side chains. For example, an aromatic amino acid is a
hydrophobic amino acid with a side chain containing at least one aromatic or
heteroaromatic ring, which may contain one or more substituents such as -OH, -SH,
-CN, -F, -Cl, -Br, -I, -NO2, -NO, -NH2, -NHR, -NRR, -C(O)R, -C(O)OH, -C(O)OR,
-C(O)NH2, -C(O)NHR, -C(O)NRR, etc., where R is independently (C1-C6) alkyl,
substituted (C1-C6) alkyl, (C1-C6) alkenyl, substituted (C1-C6) alkenyl, (C1-C6)
alkynyl, substituted (C1-C6) alkynyl, (C5-C20) aryl, substituted (C5-C20) aryl, (C6-C26)
alkaryl, substituted (C6-C26) alkaryl, 5-20 membered heteroaryl, substituted 5-20
membered heteroaryl, 6-26 membered alkheteroaryl or substituted 6-26 membered
alkheteroaryl. Genetically encoded aromatic amino acids include Phe, Tyr, and Trp,
while non-genetically encoded aromatic amino acids include phenylglycine,
2-naphthylalanine, beta-2-thienylalanine, 1,2,3,4-tetrahydro-isoquinoline-3-carboxylic
acid, 4-chlorophenylalanine, 2-fluorophenylalanine, 3-fluorophenylalanine, and
4-fluorophenylalanine.

An apolar amino acid is a hydrophobic amino acid with a side chain that is
uncharged at physiological pH and which has bonds in which a pair of electrons shared
in common by two atoms is generally held equally by each of the two atoms (i.e., the
side chain is not polar). Genetically encoded apolar amino acids include Gly, Leu, Val,
Ile, Ala, and Met, while non-genetically encoded apolar amino acids include
cyclohexylalanine. Apolar amino acids can be further subdivided to include aliphatic
amino acids, each of which is a hydrophobic amino acid having an aliphatic hydrocarbon
side chain. Genetically encoded aliphatic amino acids include Ala, Leu, Val, and Ile,
while non-genetically encoded aliphatic amino acids include norleucine.

A polar amino acid is a hydrophilic amino acid with a side chain that is
uncharged at physiological pH, but which has one bond in which the pair of electrons
shared in common by two atoms is held more closely by one of the atoms. Genetically
encoded polar amino acids include Ser, Thr, Asn, and Gln, while non-genetically
encoded polar amino acids include citrulline, N-acetyl lysine, and methionine
sulfoxide.

An acidic amino acid is a hydrophilic amino acid with a side chain pKa value of
less than 7. Acidic amino acids typically have negatively charged side chains at
physiological pH due to loss of a hydrogen ion. Genetically encoded acidic amino
acids include Asp and Glu. A basic amino acid is a hydrophilic amino acid with a side
chain pKa value of greater than 7. Basic amino acids typically have positively charged
side chains at physiological pH due to association with hydronium ion. Genetically
encoded basic amino acids include Arg, Lys, and His, while non-genetically encoded
basic amino acids include the non-cyclic amino acids ornithine, 2,3-diaminopropionic
acid, 2,4-diaminobutyric acid, and homoarginine.

It will be appreciated by one skilled in the art that the above classifications are
not absolute and that an amino acid may be classified in more than one category. In
addition, amino acids can be classified based on known behaviour and/or characteristic
chemical, physical, or biological properties based on specified assays or as compared
with previously identified amino acids. Amino acids can also include bifunctional
moieties having amino acid-like side chains.

Conservative changes can also include the substitution of a chemically
derivatised moiety for a non-derivatised residue, by for example, reaction of a
functional side group of an amino acid. Thus, these substitutions can include
compounds whose free amino groups have been derivatised to amine hydrochlorides,
p-toluene sulfonyl groups, carbobenzoxy groups, t-butyloxycarbonyl groups, chloroacetyl
groups or formyl groups. Similarly, free carboxyl groups can be derivatized to form
salts, methyl and ethyl esters or other types of esters or hydrazides, and side chains can
be derivatized to form O-acyl or O-alkyl derivatives for free hydroxyl groups or
N-im-benzylhistidine for the imidazole nitrogen of histidine. Peptide analogues also include
amino acids that have been chemically altered, for example, by methylation, by
amidation of the C-terminal amino acid by an alkylamine such as ethylamine,
ethanolamine, or ethylene diamine, or acylation or methylation of an amino acid side
chain (such as acylation of the epsilon amino group of lysine). Peptide analogues can
also include replacement of the amide linkage in the peptide with a substituted amide
(for example, groups of the formula -C(O)-NR, where R is (C1-C6) alkyl, (C1-C6)
alkenyl, (C1-C6) alkynyl, substituted (C1-C6) alkyl, substituted (C1-C6) alkenyl, or
substituted (C1-C6) alkynyl) or isostere of an amide linkage (for example, -CH2NH-,
-CH2S-, -CH2CH2-, -CH=CH- (cis and trans), -C(O)CH2-, -CH(OH)CH2-, or -CH2SO-).

The compound can be covalently linked, for example, by polymerisation or
conjugation, to form homopolymers or heteropolymers. Spacers and linkers, typically
composed of small neutral molecules, such as amino acids that are uncharged under
physiological conditions, can be used. Linkages can be achieved in a number of ways.
For example, cysteine residues can be added at the peptide termini, and multiple
peptides can be covalently bonded by controlled oxidation. Alternatively,
heterobifunctional agents, such as disulfide/amide forming agents or thioether/amide
forming agents can be used. The compound can also be constrained, for example, by
having cyclic portions.

In some embodiments, three dimensional molecular modeling techniques may
be used to identify or generate compounds that may be useful as therapeutics or
diagnostics. Standard molecular modeling tools may be used, for example, those
described in L-H Hung and R. Samudrala, PROTINFO: secondary and tertiary protein
structure prediction, Nucleic Acids Research, 2003, Vol. 31, No. 13 3296-3299; A.
Yamaguchi, et al., Enlarged FAMSBASE: protein 3D structure models of genome
sequences for 41 species, Nucleic Acids Research, 2003, Vol. 31, No. 1 463-468; J.
Chen, et al., MMDB: Entrez's 3D-structure database, Nucleic Acids Research, 2003,
Vol. 31, No. 1 474-477; R. A. Chiang, et al., The Structure Superposition Database,
Nucleic Acids Research, 2003, Vol. 31, No. 1 505-510.

Peptides or peptide analogues can be synthesized by standard chemical
techniques, for example, by automated synthesis using solution or solid phase synthesis
methodology. Automated peptide synthesizers are commercially available and use
techniques well known in the art. Peptides and peptide analogues can also be prepared
using recombinant DNA technology using standard methods such as those described in,
for example, Sambrook, et al. (Molecular Cloning: A Laboratory Manual, 2nd ed.,
Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, N.Y., 1989) or Ausubel et al. (Current Protocols in Molecular Biology, John
Wiley & Sons, 1994).

Compounds, such as peptides (or analogues thereof) can be identified by routine
experimentation by, for example, modifying residues within SARS peptides;
introducing single or multiple amino acid substitutions, deletions, or insertions, and
identifying those compounds that retain biological activity, e.g., those compounds that
have cytotoxic ability.
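
Purely as an illustration of the first step of such routine experimentation, the sketch below enumerates the single substitution, single deletion, and single insertion variants of a short peptide written in one-letter code. The function names and the example peptide are hypothetical, only the 20 genetically encoded amino acids are considered, and the identification of variants that retain biological activity is an experimental step that is not modeled here.

```python
# Illustrative sketch only: enumerate single-residue variants of a peptide
# given in one-letter amino acid code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def substitution_variants(peptide):
    """All peptides that differ from `peptide` at exactly one position."""
    return [
        peptide[:i] + residue + peptide[i + 1:]
        for i, original in enumerate(peptide)
        for residue in AMINO_ACIDS
        if residue != original
    ]


def deletion_variants(peptide):
    """All distinct peptides obtained by deleting one residue."""
    candidates = (peptide[:i] + peptide[i + 1:] for i in range(len(peptide)))
    return list(dict.fromkeys(candidates))  # drop duplicates, keep order


def insertion_variants(peptide):
    """All distinct peptides obtained by inserting one residue."""
    candidates = (
        peptide[:i] + residue + peptide[i:]
        for i in range(len(peptide) + 1)
        for residue in AMINO_ACIDS
    )
    return list(dict.fromkeys(candidates))


if __name__ == "__main__":
    peptide = "ACDKL"  # hypothetical peptide, not a SARS virus sequence
    print(len(substitution_variants(peptide)))
    print(deletion_variants(peptide))
    print(len(insertion_variants(peptide)))
```

Multiple substitutions can be enumerated by applying these functions repeatedly, although the number of candidates grows quickly with peptide length.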

In general, candidate compounds for prevention or treatment of SARS virus-mediated
disorders are identified from large libraries of both natural product and
synthetic (or semi-synthetic) extracts or chemical libraries according to methods known
in the art. Candidate or test compounds may include, without limitation, peptides,
polypeptides, synthesised organic molecules, naturally occurring organic molecules,
and nucleic acid molecules. In some embodiments, such compounds are screened for the
ability to inhibit SARS virus replication or pathogenicity, while maintaining the
infected cell's ability to grow or survive.

Those skilled in the field of drug discovery and development will understand
that the precise source of test extracts or compounds is not critical to the method(s) of
the invention. Accordingly, virtually any number of chemical extracts or compounds
can be screened using the exemplary methods described herein or using standard
methods. Examples of such extracts or compounds include, but are not limited to,
plant-, fungal-, prokaryotic- or animal-based extracts, fermentation broths, and synthetic
compounds, as well as modification of existing compounds. Numerous methods are
also available for generating random or directed synthesis (e.g., semi-synthesis or total
synthesis) of any number of chemical compounds, including, but not limited to,
saccharide-, lipid-, peptide-, and nucleic acid-based compounds. Synthetic compound
libraries are commercially available. Alternatively, libraries of natural compounds in
the form of bacterial, fungal, plant, and animal extracts are commercially available
from a number of sources, including Biotics (Sussex, UK), Xenova (Slough, UK),
Harbor Branch Oceanographic Institute (Ft. Pierce, Fla.), and PharmaMar, U.S.A.
(Cambridge, Mass.). In addition, natural and synthetically produced libraries of, for
example, SARS virus polypeptides containing leader sequences, are produced, if
desired, according to methods known in the art, e.g., by standard extraction and
fractionation methods. Furthermore, if desired, any library or compound is readily
modified using standard chemical, physical, or biochemical methods.

When a crude extract is found to modulate cytotoxicity or viral infection,
further fractionation of the positive lead extract is necessary to isolate chemical
constituents responsible for the observed effect. Thus, the goal of the extraction,
fractionation, and purification process is the careful characterization and identification
of a chemical entity within the crude extract having, for example, anti-cytotoxicity or
anti-viral properties. The same assays described herein for the detection of activities in
mixtures of compounds can be used to purify the active component and to test
derivatives thereof. Methods of fractionation and purification of such heterogeneous
extracts are known in the art. If desired, compounds shown to be useful agents for
treatment are chemically modified according to methods known in the art. Compounds
identified as being of therapeutic, prophylactic, diagnostic, or other value in, for
example, cell culture systems, such as a Vero E6 culture system, may be subsequently
analyzed using a ferret animal model, or any other animal model suitable for analysis of
SARS.


Antibodies 

The compounds of the invention can be used to prepare antibodies to SARS
virus peptides, protein, polyproteins, or analogs thereof, or to SARS virus nucleic acid
molecules or analogs thereof using standard techniques of preparation as, for example,
described in Harlow and Lane (Antibodies: A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor, N.Y., 1988), or known to those skilled in the art.
Antibodies may include polyclonal antibodies, monoclonal antibodies, hybrid
antibodies (e.g., divalent antibodies having different pairs of heavy and light chains),
chimeric antibodies (e.g., antibodies having constant and variable domains from
different species and/or class), modified antibodies (e.g., antibodies in which the
naturally occurring sequence has been altered by, for example, recombinant techniques),
Fab antibodies, anti-idiotype antibodies, etc. Antibodies can be tailored to minimise
adverse host immune response by, for example, using chimeric antibodies containing
an antigen binding domain from one species and the Fc portion from another species, or
by using antibodies made from hybridomas of the appropriate species. For example,
"humanized" antibodies may be used for administration to humans.

To generate SARS virus polypeptide-specific antibodies, a SARS virus
polypeptide coding sequence may be expressed, for example, as a C-terminal fusion
with glutathione S-transferase (GST) (Smith et al., Gene 67:31-40, 1988). The fusion
polypeptide may then be purified on glutathione-Sepharose beads, eluted with
glutathione, cleaved with thrombin (at the engineered cleavage site), and purified to the
degree necessary for immunization of rabbits. Primary immunizations are carried out
with Freund's complete adjuvant and subsequent immunizations with Freund's incomplete
adjuvant. Antibody titres are monitored by Western blot and immunoprecipitation
analyses using the thrombin-cleaved SARS virus polypeptide fragment of the GST-SARS
virus fusion polypeptide. Immune sera are affinity purified using
CNBr-Sepharose-coupled SARS virus polypeptide. Antiserum specificity is determined using
a panel of unrelated GST polypeptides.

As an alternate or adjunct immunogen to GST fusion polypeptides, peptides
corresponding to relatively unique hydrophilic SARS virus polypeptides may be
generated and coupled to keyhole limpet hemocyanin (KLH) through an introduced
C-terminal lysine. Antiserum to each of these peptides is similarly affinity purified on
peptides conjugated to BSA, and specificity tested in ELISA and Western blots using
peptide conjugates, and by Western blot and immunoprecipitation using SARS virus
polypeptide expressed as a GST fusion polypeptide.

Alternatively, monoclonal antibodies may be prepared using the SARS virus
polypeptides described above and standard hybridoma technology (see, e.g., Kohler et
al., Nature, 256:495, 1975; Kohler et al., Eur. J. Immunol. 6:511, 1976; Kohler et al.,
Eur. J. Immunol. 6:292, 1976; Hammerling et al., In Monoclonal Antibodies and T Cell
Hybridomas, Elsevier, NY, 1981; Ausubel et al., supra). Once produced, monoclonal
antibodies are also tested for specific SARS virus polypeptide recognition by Western
blot or immunoprecipitation analysis (by the methods described in Ausubel et al.,
supra). Antibodies which specifically recognize SARS virus polypeptides are
considered to be useful in the invention; such antibodies may be used, e.g., in an
immunoassay to monitor the level of SARS virus polypeptides produced by a mammal
(for example, to determine the amount or location of a SARS virus polypeptide).

In an alternative embodiment, antibodies of the invention are produced not only
using the whole SARS virus polypeptide, but also using fragments of the SARS virus
polypeptide which are unique or which lie outside highly conserved regions and appear
likely to be antigenic by criteria such as a high frequency of charged residues.
In one specific example, such fragments are generated by standard techniques
of PCR and cloned into the pGEX expression vector (Ausubel et al., supra). Fusion
polypeptides are expressed in E. coli and purified using a glutathione agarose affinity
matrix as described in Ausubel et al. (supra). To attempt to minimize the potential
problems of low affinity or specificity of antisera, two or three such fusions are
generated for each polypeptide, and each fusion is injected into at least two rabbits.
Antisera are raised by injections in a series, preferably including at least three booster
injections. SARS virus antibodies may also be prepared against SARS virus nucleic
acid molecules.
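
The selection criterion mentioned above, a high frequency of charged residues, can be
illustrated with a short sketch. The Python below is only a minimal illustration and not
the procedure used in the invention: the window length, the choice of D, E, K, R and H
as the charged residues, the function names and the example peptide are all assumptions
introduced here.

```python
# Assumption: Asp, Glu, Lys, Arg and His are counted as charged residues.
CHARGED = set("DEKRH")


def charged_fraction_windows(seq, window=10):
    """Return (start, fraction_charged) for every window; start is 0-based."""
    seq = seq.upper()
    if window <= 0 or window > len(seq):
        return []
    scores = []
    for start in range(len(seq) - window + 1):
        fragment = seq[start:start + window]
        charged = sum(residue in CHARGED for residue in fragment)
        scores.append((start, charged / window))
    return scores


def best_window(seq, window=10):
    """Return the first window with the highest charged fraction, or None."""
    scores = charged_fraction_windows(seq, window)
    if not scores:
        return None
    return max(scores, key=lambda item: item[1])


# Hypothetical peptide, used only to show the call pattern.
example_peptide = "MAAAGGKRDEKRLLVVAA"
candidate = best_window(example_peptide, window=6)
```

A window that scores highly here would still need to be checked against the other
criteria in the text, such as uniqueness and location outside conserved regions.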

Antibodies may be used as diagnostics, therapeutics, or prophylactics for SARS
virus-related disorders. Antibodies may also be used to isolate SARS virus and
compounds by, for example, affinity chromatography, or to identify SARS virus
compounds isolated or generated by other techniques.

Arrays and Libraries 

In some aspects, biological assays, such as diagnostic or other assays, using
high density nucleic acid, polypeptide, or antibody arrays, for example high density
miniaturized arrays or "microarrays," of SARS virus nucleic acid molecules or
polypeptides, or antibodies capable of specifically binding such nucleic acid molecules
or polypeptides, may be performed. Macroarrays, performed for example by manual
spotting techniques, may also be used. Arrays generally require a solid support (for
example, nylon, glass, ceramic, plastic, silicon, nitrocellulose or PVDF membranes,
microwells, microbeads, e.g., magnetic microbeads, etc.) to
which the nucleic acid molecules or polypeptides or antibodies are attached in a
specified two-dimensional arrangement, such that the pattern of hybridization is easily
determinable. Suspension arrays (particles in suspension) that are coded to facilitate
identification may also be used. SARS virus nucleic acid molecules or polypeptide
probes or targets may be compounds as described herein.

In some embodiments, high density nucleic acid arrays may for example be
used to monitor the presence or level of expression of a large number of SARS virus
nucleic acid molecules or genes or for detecting or identifying SARS virus nucleic acid
sequence variations, mutations or polymorphisms. For the purpose of such arrays,
"nucleic acids" may include any polymer or oligomer of nucleosides or nucleotides
(polynucleotides or oligonucleotides), which include pyrimidine and purine bases,
preferably cytosine, thymine, and uracil, and adenine and guanine, respectively, or may
include peptide nucleic acids (PNA). In an alternative aspect, the invention provides
nucleic acid microarrays including a number of distinct nucleic acid sequence arrays of
the invention, thus providing specific "sets" of sequences. The number of distinct
sequences may for example be any integer between 2 and 1 x 10^, such as at least 10^,
10^, 10^, or 10^.

The invention also provides gene knockout and expression libraries. Thus,
nucleic acid molecules encoding SARS virus polypeptides or proteins (e.g., PCR
products of ORFs or total mRNA) may for example be attached to a solid support,
hybridized with single stranded detectably-labeled cDNAs (corresponding to an
"antisense" orientation), and quantified using an appropriate method such that a signal
is detected at each location at which hybridization has taken place. The intensity of the
signal would then reflect the level of gene expression. Comparison of results from
viruses, for example, of different strains or from different samples or subjects, would
elucidate differing levels of expression of specified genes. Using similar techniques,
homologous nucleic acids may be identified from different viruses if SARS virus
nucleic acids are used in the microarray, and probed with nucleic acid molecules from
different viruses or subjects. In some embodiments, this approach may involve
constructing his-tagged ORF expression libraries of viral genomes in a bacterial host,
similar to an expression library in yeast (Martzen M. R. et al., 1999. Science,
286:1153). ORF-encoded protein activities may for example be detected in purified
his-tagged protein pools in cases where activities cannot be detected in extracts or cells.
In one aspect of the invention, arrayed libraries may be constructed of viral strains each of
which bears a plasmid expressing a different SARS virus ORF under control of an
inducible promoter. ORFs are amplified using PCR and cloned into a vector that
enables their expression as N-terminal his-tagged polypeptides. These amplicons are
also used to construct hybridization microarrays and enable targeted gene disruption,
reducing expenses. A suitable expression host is selected, and genes encoding
particular biochemical activities are identified by screening arrayed pools of his-tagged
proteins as described previously (Martzen M. R., McCraith S. M., Spinelli S. L., Torres
F. M., Fields S., Grayhack E. J., and Phizicky E. M., 1999. Science, 286:1153).
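
The comparison of signal intensities between samples described above can be sketched
as a per-spot log ratio. The following Python is a simplified illustration under stated
assumptions, not a method taken from this disclosure: the ORF labels, the intensity
values and the `floor` parameter are hypothetical, and a real analysis would also need
background subtraction and normalization.

```python
import math


def log2_ratios(sample_a, sample_b, floor=1.0):
    """Map each spot present in both samples to log2(intensity_a / intensity_b).

    Intensities below `floor` are raised to `floor` so the ratio stays defined.
    """
    ratios = {}
    for spot in sample_a.keys() & sample_b.keys():
        a = max(sample_a[spot], floor)
        b = max(sample_b[spot], floor)
        ratios[spot] = math.log2(a / b)
    return ratios


# Hypothetical spot intensities for two samples (arbitrary units).
strain_1 = {"orf_example_1": 800.0, "orf_example_2": 100.0}
strain_2 = {"orf_example_1": 200.0, "orf_example_2": 100.0}
ratios = log2_ratios(strain_1, strain_2)
```

A positive value indicates a stronger signal in the first sample, a negative value a
stronger signal in the second, and zero indicates equal signal at that spot.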

In some embodiments, protein arrays (including antibody or antigen arrays)
may be used for the analysis and identification of SARS virus polypeptides or host
responses to such polypeptides. Thus, protein arrays may be used to detect SARS virus
polypeptides in a patient; distinguish a SARS virus polypeptide from a host
polypeptide; detect interactions between SARS virus polypeptides and, for example, host
proteins; determine the efficacy of potential therapeutics, such as small molecules or
ligands that may bind SARS virus polypeptides; determine protein-antibody
interactions; and/or detect enzyme-substrate interactions. Protein
arrays may also be used to detect SARS virus antigens and antibodies in samples; to
profile expression of SARS virus polypeptides; to identify suitable antibodies or map
epitopes; or for a variety of protein function analyses.

A variety of methods are known for making and using microarrays, as for
example disclosed in Cheung V. G., et al., 1999. Nature Genetics Supplement,
21:15-19; Lipshutz R. J., et al., 1999. Nature Genetics Supplement, 21:20-24; Bowtell D. D.
L., 1999. Nature Genetics Supplement, 21:25-32; Singh-Gasson S., et al., 1999.
Nature Biotechnol., 17:974-978; and Schweitzer B., et al., 2002. Nature Biotechnol.,
20:359-365. Thus, for example, microarrays may be designed by synthesizing
oligonucleotides with sequence variations based on a reference sequence, such as any
SARS virus sequence described herein. Methods for storing, querying and analyzing
microarray data have been disclosed in, for example, United States Patent
No. 6,484,183; United States Patent No. 6,188,783; and Holloway A. J., et al., 2002.
Nature Genetics Supplement, 32:481-489. Protein arrays may be constructed, detected,
and analysed using methods known in the art, for example mass spectrometric
techniques, immunoassays such as ELISA and western (dot) blotting combined with, for
example, fluorescence detection techniques, and adapted for high throughput analysis,
as described in, for example, MacBeath, G. and Schreiber, S.L. Science 2000, 289,
1760-1763; Levit-Binnun N, et al. (2003) Quantitative detection of protein arrays. Anal
Chem 75:1436-41; Kukar T, et al. (2002) Protein microarrays to detect protein-protein
interactions using red and green fluorescent proteins. Anal Biochem 306:50-4;
Borrebaeck CA, et al. (2001) Protein chips based on recombinant antibody fragments:
a highly sensitive approach as detected by mass spectrometry. Biotechniques
30:1126-1132; Huang RP (2001) Detection of multiple proteins in an antibody-based protein
microarray system. J Immunol Methods 255:1-13; Emili AQ and Cagney G (2000)
Large-scale functional analysis using peptide or protein arrays. Nature Biotechnol
18:393-397; Zhu H, et al. (2000) Analysis of yeast protein kinases using protein chips.
Nature Genet 26:283-9; Lucking A, et al. (1999) Protein Microarrays for Gene
Expression and Antibody Screening. Anal. Biochem. 270:103-111; or Templin MF, et
al. (2002) Protein microarray technology. Drug Discov Today 7:815-822. Tools for
microarray techniques are available commercially from, for example, Affymetrix, Santa
Clara, CA; Nanogen, San Diego, CA; or Sequenom, San Diego, CA.

Computer Readable Records 

Nucleic acid and polypeptide sequences, as described herein, or a fragment
thereof, may be provided in a variety of media to facilitate access to these sequences
and enable the use thereof. Accordingly, SARS virus nucleic acid and polypeptide
sequences of the invention may be recorded or stored on computer readable media,
using any technique and format that is appropriate for the particular medium.

In alternative embodiments, the invention provides computer readable media
encoded with a number of distinct nucleic acid or amino acid data sequences of the
invention. The number of distinct sequences may for example be any integer between 2
and 1 x 10^, such as at least 10^, 10^, 10^, or 10^. In one embodiment, the invention
features a computer medium having a plurality of digitally encoded data records. Each
data record may include a value representing a nucleic acid or amino acid sequence of
the invention. In some embodiments, the data record may further include values
representing the level of expression, level or activity of a nucleic acid or amino acid
sequence of the invention. The data record can be structured as a table, for example, a
table that is part of a database such as a relational database (for example, a SQL
database of the Oracle or Sybase database environments). The invention also includes a
method of communicating information about a sample, for example by transmitting
information, for example transmitting a computer readable record as described herein,
for example over a computer network. The polypeptide and nucleic acid sequences of
the invention, and sequence information pertaining thereto, may be routinely accessed
by one of ordinary skill in the art for a variety of purposes, including for the purposes
of comparing substantially identical sequences, etc. Such access may be facilitated
using publicly available software as described herein. By "computer readable media"
is meant any medium that can be read and accessed directly by a computer. Such media
include, but are not limited to: magnetic storage media, such as floppy discs, hard disc
storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical
storage media such as RAM and ROM; and hybrids of these categories such as
magnetic/optical storage media.
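
A data record structured as a table in a relational database, as described above, might
look like the following sketch. It uses Python's built-in `sqlite3` module with an
in-memory SQLite database purely for convenience, in place of the Oracle or Sybase
environments named in the text. The table name, column names and example rows are
hypothetical and are not taken from the sequence listing.

```python
import sqlite3


def build_sequence_db(records):
    """Create an in-memory table of sequence records and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE sequence_record (
            seq_id           TEXT PRIMARY KEY,
            molecule_type    TEXT NOT NULL
                CHECK (molecule_type IN ('nucleic_acid', 'amino_acid')),
            sequence         TEXT NOT NULL,
            expression_level REAL  -- optional value; NULL when not measured
        )
        """
    )
    conn.executemany(
        "INSERT INTO sequence_record VALUES (?, ?, ?, ?)", records
    )
    conn.commit()
    return conn


# Hypothetical records: one nucleotide entry and one amino acid entry.
records = [
    ("example_nt_1", "nucleic_acid", "AATTCGCGGCCGCGTCGAC", None),
    ("example_aa_1", "amino_acid", "MKRDE", 1.5),
]
conn = build_sequence_db(records)
rows = conn.execute(
    "SELECT seq_id, length(sequence) FROM sequence_record "
    "WHERE molecule_type = ? ORDER BY seq_id",
    ("nucleic_acid",),
).fetchall()
```

Keeping the sequence and its associated values in one row makes it straightforward to
query records by molecule type or by expression level.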

Pharmaceutical and Veterinary Compositions, Dosages, and Administration

Compounds of the invention can be provided alone or in combination with other
compounds (for example, small molecules, peptides, or peptide analogues), in the
presence of a liposome, an adjuvant, or any pharmaceutically acceptable carrier, in a
form suitable for administration to humans or to animals.

Conventional pharmaceutical practice may be employed to provide suitable
formulations or compositions to administer the compounds to patients suffering from or
presymptomatic for SARS. Any appropriate route of administration may be employed,
for example, parenteral, intravenous, subcutaneous, intramuscular, intracranial,
intraorbital, ophthalmic, intraventricular, intracapsular, intraspinal, intracisternal,
intraperitoneal, intranasal, aerosol, or oral administration. In some embodiments,
compounds are delivered directly to the lung by, for example, formulations suitable for
inhalation. In some embodiments, gene therapy techniques may be used for
administration of SARS virus nucleic acid molecules, for example, as DNA
vaccines. Formulations may be in the form of liquid solutions or suspensions; for oral
administration, formulations may be in the form of tablets or capsules; and for
intranasal formulations, in the form of powders, nasal drops, or aerosols.

Methods well known in the art for making formulations are found in, for
example, "Remington's Pharmaceutical Sciences" (18th edition), ed. A. Gennaro, 1990,
Mack Publishing Company, Easton, Pa. Formulations for parenteral administration
may, for example, contain excipients, sterile water, or saline, polyalkylene glycols such
as polyethylene glycol, oils of vegetable origin, or hydrogenated naphthalenes.
Biocompatible, biodegradable lactide polymer, lactide/glycolide copolymer, or
polyoxyethylene-polyoxypropylene copolymers may be used to control the release of
the compounds. Other potentially useful parenteral delivery systems for modulatory
compounds include ethylene-vinyl acetate copolymer particles, osmotic pumps,
implantable infusion systems, and liposomes. Formulations for inhalation may contain
excipients, for example, lactose, or may be aqueous solutions containing, for example,
polyoxyethylene-9-lauryl ether, glycocholate and deoxycholate, or may be oily
solutions for administration in the form of nasal drops, or as a gel.

If desired, treatment with a compound according to the invention may be
combined with more traditional therapies for the disease.

For therapeutic or prophylactic compositions, the compounds are administered
to an individual in an amount sufficient to stop or slow the replication of the SARS
virus, or to confer protective immunity against future SARS virus infection. Amounts
considered sufficient will vary according to the specific compound used, the mode of
administration, the stage and severity of the disease, the age, sex, and health of the
individual being treated, and concurrent treatments. As a general rule, however,
dosages can range from about 1 µg to about 100 mg per kg body weight of a patient for
an initial dosage, with subsequent adjustments depending on the patient's response,
which can be measured, for example, by determining the presence of SARS nucleic acid
molecules, polypeptides, or virions in the patient's peripheral blood.

In the case of vaccine formulations, an immunogenically effective amount of a
compound of the invention can be provided, alone or in combination with other
compounds, with an adjuvant, for example, Freund's incomplete adjuvant or aluminum
hydroxide. The compound may also be linked with a carrier molecule, such as bovine
serum albumin or keyhole limpet hemocyanin, to enhance immunogenicity.

In general, compounds of the invention should be used without causing substantial
toxicity. Toxicity of the compounds of the invention can be determined using standard
techniques, for example, by testing in cell cultures or experimental animals and
determining the therapeutic index, i.e., the ratio between the LD50 (the dose lethal to
50% of the population) and the LD100 (the dose lethal to 100% of the population). In
some circumstances, however, such as in severe disease conditions, it may be necessary
to administer substantial excesses of the compositions.

Virus Isolation

Virus isolation was performed on a bronchoalveolar lavage specimen of a fatal
SARS case belonging to the original case cluster from Toronto, Canada. All work with
the infectious agent was performed in a biosafety level 3 (BSL3) laboratory using an
N100 mask for personal protection. Samples were removed from BSL3 after addition
of the RNA extraction buffer. The virus isolate, named the "Tor2 isolate," was grown
in African Green Monkey Kidney (Vero E6) cells, the viral particles were purified, and
the genetic material (RNA) was extracted from the Tor2 isolate (Poutanen, S. M. et al.,
N Engl J Med, Apr 10, 2003). More specifically, one hundred microlitre specimens
were used to inoculate Vero E6 cells (ATCC CRL 1586) in Dulbecco's Modified
Eagle Medium supplemented with penicillin/streptomycin, glutamine and 2% fetal calf
serum. The culture was incubated at 37°C. Cytopathogenic effect was observed 5 days
post inoculation. The virus was passaged into newly seeded Vero E6 cells which
showed a cytopathogenic effect as early as 2 days post infection (multiplicity of
infection 10-^). A virus stock was prepared from passage 2 of these cells and preserved
in liquid nitrogen. The titer of the virus stock was determined to be 1 x 10^ plaque
forming units (p.f.u.) by plaque assay and 5 x 10^ by tissue culture infectious dose
(TCID)50.

For virus propagation, 10 x T-162 flasks of Vero E6 cells were infected with a
multiplicity of infection of 10"^. When infected cells showed a cytopathogenic effect of
'4+' (48 hours post infection), the cultures were frozen and thawed to lyse the
cells, and the supernatants were clarified from cell debris by centrifugation at 10,000
rpm in a Beckman high-speed centrifuge. The supernatants were treated with DNAse
and RNAse for 3 hours at 37°C to remove any cellular genomic nucleic acids and
subsequently extracted with an equal volume of 1,1,2-trichloro-trifluoroethane. The
top fraction was ultra-centrifuged through a 5% / 40% glycerol step gradient at 151,000
x g for 1 hour at 4°C. The virus pellet was resuspended in PBS. RNA was isolated
using a commercial kit from QIAGEN and stored at -80°C for further use.

cDNA Library Construction

WO 2004/096842 PCT/CA2004/000626

The RNA and subsequent products were handled under biosafety level 2
(BSL2) conditions. The RNA sample was converted to a cDNA library, using a
combined random-priming and oligo-dT priming strategy, and resultant subgenomic
clones were processed under level 1 biosafety conditions. More specifically, purified
viral RNA (55 ng) was used in the construction of a random primed and oligo-dT
primed cDNA library, using the Superscript Choice System for cDNA synthesis
(Invitrogen). Linkers 5'-AATTCGCGGCCGCGTCGAC-3', SEQ ID NO: 195, and
5'-pGTCGACGCGGCCGCG-3', SEQ ID NO: 196, were ligated following cDNA
synthesis. The cDNA synthesis products were visualized on agarose gels, revealing the
anticipated low-yield smear. To produce sufficient cDNA for cloning, the cDNA
product was size fractionated on a low-melting point preparative agarose gel, followed
by PCR amplification using a single PCR primer, 5'-AATTCGCGGCCGCGTCGAC-3',
SEQ ID NO: 197, specific to the linkers. This yielded sufficient material for cloning.

Size-selected cDNA products were cloned and single sequence reads were
generated from each end of the insert from randomly picked clones. A list of the SARS
virus clones is provided in the accompanying sequence listing, which is incorporated
by reference herein (SEQ ID NOs: 92-159, 208 and 209).
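
The relationship between the two linker oligonucleotides given above can be checked
with a short reverse-complement calculation. This is an illustrative sketch only: the
function and variable names are assumptions introduced here, and the 5' phosphate
("p") of SEQ ID NO: 196 is omitted because it is not a base.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence made of A, C, G and T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def five_prime_overhang(long_oligo, short_oligo):
    """Return the unpaired 5' bases of long_oligo when short_oligo anneals to
    its 3' end, or None if the two oligonucleotides are not complementary there."""
    paired = reverse_complement(short_oligo)
    if long_oligo.endswith(paired):
        return long_oligo[: len(long_oligo) - len(paired)]
    return None


LINKER_LONG = "AATTCGCGGCCGCGTCGAC"  # SEQ ID NO: 195 as printed in the text
LINKER_SHORT = "GTCGACGCGGCCGCG"     # SEQ ID NO: 196 without the 5' phosphate

overhang = five_prime_overhang(LINKER_LONG, LINKER_SHORT)
```

Under these assumptions the shorter oligonucleotide pairs with the 3' portion of the
longer one and leaves a four-base 5' overhang, which is consistent with a
double-stranded adapter bearing an EcoRI-compatible end.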

More specifically, size-selected cDNAs were ligated into the pCR4-TOPO TA
cloning vector (Invitrogen, CA), or after digestion with the restriction nuclease Not I
into the pBR194c vector (The Institute for Genomic Research, Rockville, MD, USA).
Ligated clones were then transformed by electroporation into DH10B T1 cells
(Invitrogen), plated on 22 cm agar plates with the appropriate antibiotic and grown for
16 hours at 37°C. Colonies were picked into 384-well Axygen culture blocks
containing 2 x YT media and grown in a shaking incubator for 18 hours at 37°C. Cells
were lysed and DNA purified using standard laboratory procedures. Sequencing
primers for the 194c clones were 5'-GGGCTCTTCGCTATTACGC-3' (forward
primer) and 5'-TGCAGGTCGACTCTAGAGGAT-3' (reverse primer).

DNA Sequencing And Assembly Of Reads

Sequences were assembled and the assembly edited to produce the genomic
sequence of the SARS virus. More specifically, DNA sequencing of both ends of the
plasmid templates was achieved using Applied Biosystems BigDye terminator reagent
(version 3), with electrophoresis and data collection on AB 3700 and 3730 XL
instruments. DNA sequence reads were screened for non-viral contaminating
sequences, trimmed for quality using PHRED (Ewing, B. and P. Green, Genome Res 8,
186-94, Mar, 1998) and assembled using PHRAP (Gordon, D. et al., Genome Res 8,
195-202, Mar, 1998). Simultaneously, sequences were used in BLAST searches of
viral nucleotide and non-redundant protein datasets (NCBI, National Library of
Medicine) to search for similarities. Sequence assemblies were visualized using
CONSED (Gordon, D. et al., Genome Res 8, 195-202, Mar, 1998). Sequence
mis-assemblies and contig joins were identified using Miropeats (Parsons, J. D., Comput
Appl Biosci 11, 615-9, Dec, 1995). As sequence data accrued, the additional sequences
were assembled until it became apparent that the additional depth of sampling was
increasing depth of coverage but not extending the length of the contig. At this point,
3,080 sequencing reads were generated, 2,634 of which were assembled into a single
large contig.

The sequence information was imported into an ACEDB database (Durbin, J.
Thierry-Mieg. 1991-. A C. elegans Database. Documentation, code and data available
from anonymous FTP servers at lirmm.lirmm.fr, cele.mrc-lmb.cam.ac.uk and
ncbi.nlm.nih.gov) and subjected to biological analysis including the identification of
open reading frames, detection of similar sequences by BLAST and searching for
apparent frameshifts. When frameshifts were identified by this analysis, the sequence
assembly was consulted for evidence of sequencing errors and, if found, they were
corrected. The sequences were also searched for any that could extend the 5' end of the
sequence and these were incorporated when found. High quality sequence
discrepancies between different sequence reads were identified and resolved. Sequence
reads classified as deleted or chimeric were identified through manual inspection and
removed from the assembly. The resulting sequence has an average PHRED consensus
quality score of 89.96. The lowest quality bases in the assembly are in the immediate
vicinity of the 5' and 3' ends of the viral genome, with the lowest quality base having a
PHRED score of 35. Most (29,694 of the 29,736 (99.86%)) of the bases have a
consensus score of 90. Almost all regions of the genome are represented by reads
derived from both strands of the plasmid sequencing templates, the exceptions being 50
bases at the 5' end represented by a single sequencing read, and 5 bases at the 3' end
represented by a single read. The average base in the assembly is represented by 30
reads in the forward direction and 30 reads in the reverse direction, as determined by
PHRED. RT-PCR products predicted from the sequence and spanning the entire
genome yield PCR products of the anticipated size on agarose gels. To confirm the 5'
end of the viral genome, RACE was performed using the RLM-RACE kit from Ambion
and primers 5'-CAGGAAACAGCTATGACACCAAGAACAAGGCTCTCCA-3'
(SEQ ID NO: 90) and
5'-CAGGAAACAGCTATGACGATAGGGCCTCTTCCACAGA-3' (SEQ ID NO: 91).
Fourteen clones were recovered and sequenced. Analysis of these sequences confirmed
the 5' end of the coronavirus genome. The SARS genomic sequences have been
deposited into Genbank (Accession Nos. AY274119.1, AY274119.2, and
AY274119.3).
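
The quality figures quoted above can be put in perspective with the standard Phred
relation Q = -10 log10(P), where P is the estimated probability that a base call is
wrong. The Python below is a minimal sketch assuming that definition applies to the
scores reported here; consensus scores produced during assembly may be derived
differently, and the helper names are introduced only for illustration.

```python
def phred_to_error_probability(q):
    """Convert a Phred-scaled quality score to an estimated error probability."""
    return 10 ** (-q / 10)


def percent(part, total):
    """Return part as a percentage of total."""
    return 100.0 * part / total


# Estimated error probability for the lowest-quality base (score 35).
lowest_base_error = phred_to_error_probability(35)

# Share of bases at the top consensus score, from the counts in the text.
top_score_share = percent(29694, 29736)

# Share of sequencing reads that assembled into the single large contig.
assembled_share = percent(2634, 3080)
```

Under this definition a score of 35 corresponds to an estimated error rate of roughly
3 in 10,000, and the counts reproduce the 99.86% figure given in the text.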

While the invention has been described in connection with specific
embodiments thereof, it will be understood that it is capable of further modifications
and this application is intended to cover any variations, uses, or adaptations of the
invention following, in general, the principles of the invention and including such
departures from the present disclosure that come within known or customary practice
within the art to which the invention pertains, and may be applied to the essential
features set forth herein and in the scope of the appended claims.

All patents, patent applications, and publications referred to herein are hereby
incorporated by reference in their entirety to the same extent as if each individual
patent, patent application, or publication was specifically and individually indicated to
be incorporated by reference in its entirety.

What is claimed is:

1. A substantially pure SARS virus nucleic acid molecule.

2. The molecule of claim 1, wherein said molecule is selected from the group
consisting of genomic RNA or DNA, cDNA, synthetic DNA, or mRNA.

3. The molecule of claim 1 or 2, wherein said molecule comprises a sequence
substantially identical to a sequence selected from the group consisting of SEQ ID
NOs: 1-13, 15-18, 20-30, 90-159, 208, and 209 or a fragment thereof.

4. The molecule of claim 3, wherein said molecule comprises a sequence selected
from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 15 or a
fragment thereof.

5. The molecule of claim 3, wherein said molecule comprises a sequence
substantially identical to a sequence selected from the group consisting of SEQ ID NO:
1, SEQ ID NO: 2, and SEQ ID NO: 15, or a fragment thereof.

6. The molecule of any one of claims 1 through 3, wherein said molecule
comprises a s2m motif.

7. The molecule of claim 6, wherein said s2m motif comprises a sequence
substantially identical to a sequence selected from the group consisting of SEQ ID
NOs: 16, 17, and 18.

8. The molecule of any one of claims 1 through 3, wherein said molecule
comprises a leader sequence.

9. The molecule of claim 8, wherein said leader sequence comprises a sequence
substantially identical to the sequence of SEQ ID NO: 3.

10. The molecule of any one of claims 1 through 3, wherein said molecule
comprises a transcriptional regulatory sequence.

11. The molecule of claim 10, wherein said transcriptional regulatory sequence
comprises a sequence substantially identical to the sequence selected from the group
consisting of SEQ ID NOs: 4-13 and 20-30.

12. The molecule of claim 1, wherein said molecule comprises a sequence
substantially identical to a sequence selected from nucleotides 265 - 13,398; 13,398 -
21,485; 21,492 - 25,259; 25,268 - 26,092; 25,689 - 26,153; 26,117 - 26,347; 26,398 -
27,063; 27,074 - 27,265; 27,273 - 27,641; 27,638 - 27,772; 27,779 - 27,898; 27,864 -
28,118; 28,120 - 29,388; 28,130 - 28,426; 28,583 - 28,795; and 29,590 - 29,621 of
SEQ ID NO: 15.

13. The molecule of any one of claims 1 through 3, wherein said molecule encodes
a polyprotein.

14. The molecule of any one of claims 1 through 3, wherein said molecule encodes
a polypeptide.

15. A substantially pure SARS virus polypeptide.

16. The polypeptide of claim 15, wherein said polypeptide comprises a polyprotein.

17. The polypeptide of claim 15, wherein said polypeptide comprises an identifiable
signal sequence.

18. The polypeptide of claim 17, wherein said signal sequence comprises a
sequence substantially identical to a sequence selected from the group consisting of
SEQ ID NOs: 76 and 85.

19. The polypeptide of claim 15, wherein said polypeptide comprises a
transmembrane domain.

20. The polypeptide of claim 19, wherein said transmembrane domain comprises a
sequence substantially identical to a sequence selected from the group consisting of
SEQ ID NOs: 77-86.

21. The polypeptide of claim 15, wherein said polypeptide comprises a
glycoprotein.

22. The polypeptide of claim 21, wherein said glycoprotein comprises a matrix
glycoprotein.

23. The polypeptide of claim 22, wherein said matrix glycoprotein comprises a
sequence substantially identical to SEQ ID NO: 34.

24. The polypeptide of claim 15, wherein said polypeptide is selected from the
group consisting of a transmembrane protein and a multitransmembrane protein.

25. The polypeptide of claim 15, wherein said polypeptide is selected from the
group consisting of a type I transmembrane protein and a type II transmembrane
protein.

26. The polypeptide of claim 24, wherein said polypeptide comprises a
transmembrane anchor or a transmembrane helix.

27. The polypeptide of any one of claims 1 through 3, wherein said polypeptide
comprises an epitope of a SARS virus.

28. The polypeptide of claim 15, wherein said polypeptide comprises an
ATP-binding domain.

29. The polypeptide of claim 15, wherein said polypeptide comprises a viral
envelope protein.

30. The polypeptide of claim 15, wherein said polypeptide comprises a nuclear
localization signal.

31. The polypeptide of claim 15, wherein said polypeptide comprises a lysine-rich
sequence.

32. The polypeptide of claim 31, wherein said lysine-rich sequence comprises a
sequence substantially identical to SEQ ID NO: 14.

33. The polypeptide of claim 15, wherein said polypeptide comprises an RNA
binding protein.

34. The polypeptide of claim 15, wherein said polypeptide comprises a hydrophilic
domain.

35. The polypeptide of claim 34, wherein said hydrophilic domain comprises a
sequence substantially identical to SEQ ID NO: 87.

36. The polypeptide of claim 15, wherein said polypeptide is selected from the
group consisting of replicase 1a, replicase 1b, spike glycoprotein, small envelope
protein, matrix glycoprotein, and nucleocapsid protein.

37. The polypeptide of claim 15, wherein said polypeptide comprises a sequence
substantially identical to a sequence selected from the group consisting of SEQ ID
NOs: 14, 33-36, 64-74, and 76-87 or a fragment thereof.



38. A vector comprising the nucleic acid molecule of claim 1.

39. The vector of claim 38, wherein said vector comprises a sequence substantially
identical to a sequence selected from the group consisting of SEQ ID NOs: 1-13, 15-18,
20-30, 90-159, 208, and 209.

40. The vector of claim 38, wherein said vector is a gene therapy vector.

41. A host cell comprising the vector of claim 38. 

42. The host cell of claim 41, wherein said cell is selected from the group consisting
of a mammalian cell, a yeast, a bacterium, and a nematode cell.

43. A nucleic acid molecule having substantial nucleotide sequence identity to a
sequence encoding a SARS virus polypeptide or fragment thereof, wherein said
fragment comprises at least six amino acids, and wherein said nucleic acid molecule
hybridizes under high stringency conditions to at least a portion of a SARS virus
nucleic acid molecule.

44. The nucleic acid molecule of claim 43, wherein said nucleic acid molecule has
100% sequence complementarity to said sequence encoding a SARS virus polypeptide
or fragment thereof.

45. A nucleic acid molecule having substantial nucleotide sequence identity to a
SARS virus nucleotide sequence, wherein said nucleic acid molecule comprises at least
ten nucleotides, and wherein said nucleic acid molecule hybridizes under high
stringency conditions to at least a portion of a SARS virus nucleic acid molecule.

46. The nucleic acid molecule of claim 45, wherein said nucleic acid molecule has 
100% sequence complementarity to said SARS virus nucleotide sequence. 

47. A nucleic acid molecule comprising a sequence that is antisense to a SARS
virus nucleic acid molecule.

48. An antibody that specifically binds to a SARS virus polypeptide.

49. The antibody of claim 48, wherein said antibody is a neutralizing antibody. 

50. A method for detecting a SARS virus virion or polypeptide in a sample, said
method comprising contacting said sample with the antibody of claim 48, and
determining whether said antibody specifically binds to said polypeptide.

51. A method for detecting a SARS virus genome or gene or homolog or fragment
thereof in a sample, said method comprising contacting a SARS virus nucleic acid
molecule, wherein said nucleic acid molecule comprises at least ten nucleotides, with a
preparation of DNA from said sample, under hybridization conditions providing
detection of DNA sequences having nucleotide sequence identity to a SARS virus
nucleic acid molecule.

52. The method of claim 51, wherein said nucleic acid molecule comprises at least
one of a primer pair, wherein said primer pair hybridizes to said SARS virus genome
or gene or homolog or fragment thereof under conditions suitable for polymerase chain
reaction.

53. A method of targeting a protein for secretion from a cell, said method
comprising attaching a signal sequence from a SARS virus polypeptide to said protein,
such that said protein is secreted from said cell.

54. A nucleic acid molecule comprising a sequence complementary to a SARS
virus nucleotide sequence.

55. A kit for detecting the presence of a SARS virus nucleic acid molecule or
polypeptide in a sample, said kit comprising a reagent selected from the group
consisting of a SARS virus nucleic acid molecule and an antibody that specifically
binds a SARS virus polypeptide.

56. A method for eliciting an immune response in an animal, said method
comprising identifying an animal infected with or at risk for infection with a SARS
virus, and administering a SARS virus polypeptide or fragment thereof, or
administering a SARS virus nucleic acid molecule encoding a SARS virus polypeptide
or fragment thereof, to said animal.

57. The method of claim 56, wherein said administering results in the production of
an antibody in said animal.

58. The method of claim 56, wherein said administering results in the generation of
cytotoxic or helper T-lymphocytes in said animal.

59. A method for treating or preventing a SARS virus infection comprising
identifying an animal infected with or at risk for infection with a SARS virus, and
administering a SARS virus nucleic acid molecule or polypeptide, or administering a
compound that inhibits pathogenicity or replication of a SARS virus, to the animal.

60. The method of claim 59, wherein the animal is a human.

61. Use of a SARS virus nucleic acid molecule or polypeptide for treating or
preventing a SARS virus infection.

62. A method of identifying a compound for treating or preventing a SARS virus
infection, comprising contacting a sample comprising a SARS virus nucleic acid
molecule or contacting a SARS virus polypeptide with the compound, wherein an
increase or decrease in the expression or activity of the nucleic acid molecule or the
polypeptide identifies a compound for treating or preventing a SARS virus infection.

63. A vaccine comprising a SARS virus nucleic acid molecule or polypeptide. 

64. The vaccine of claim 62, wherein the vaccine is a DNA vaccine. 



WO 2004/096842 PCT/CA2004/000626

65. A microarray comprising a plurality of elements, wherein each element
comprises one or more distinct nucleic acid or amino acid sequences, and wherein the
sequences are selected from a SARS virus nucleic acid molecule or polypeptide, or an
antibody that specifically binds a SARS virus nucleic acid molecule or polypeptide.

66. A computer readable record comprising distinct SARS virus nucleic acid or 
amino acid sequences. 

67. The computer readable record of claim 65, wherein the computer readable
record comprises a database.
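
Several of the claims above turn on sequence complementarity between a nucleic acid molecule and a SARS virus nucleotide sequence (for example, "100% sequence complementarity" and "a sequence complementary to"). The following is a minimal illustrative sketch, not part of the claimed subject matter, of how complementarity between two equal-length DNA strings might be computed. The function names are hypothetical, the scoring assumes a gap-free antiparallel alignment, and the ten-nucleotide example fragment is taken from the first ten letters of the listing below purely for illustration.

```python
# Illustrative sketch only: names and scoring convention are assumptions,
# not definitions taken from the claims or the description.

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    try:
        return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))
    except KeyError as exc:
        raise ValueError(f"unexpected nucleotide symbol: {exc.args[0]!r}") from None


def percent_complementarity(probe: str, target: str) -> float:
    """Percentage of positions at which `probe` base-pairs with `target`.

    Both sequences are written 5' to 3' and must have the same length;
    the probe is compared against the reverse complement of the target.
    """
    if len(probe) != len(target):
        raise ValueError("sequences must have the same length")
    if not probe:
        raise ValueError("sequences must not be empty")
    expected = reverse_complement(target)
    matches = sum(p == e for p, e in zip(probe.upper(), expected))
    return 100.0 * matches / len(probe)


if __name__ == "__main__":
    target = "CTACCCAGGA"               # illustrative ten-nucleotide fragment
    probe = reverse_complement(target)  # fully complementary probe
    print(probe, percent_complementarity(probe, target))
```

Under this convention a probe equal to the reverse complement of the target scores 100%, and each mismatched position lowers the score by 100/N for a sequence of N nucleotides. Hybridization under high stringency conditions depends on more than this count and is not modelled here.
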

CTACCCAGGAAAAGCCAACCAACCTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATC^

AGCTGTCGCTCGGCTGCATGCCTAGTGCACCTACGCAGTATAAACAATAATAAATTTTACTGTCGTTCACA 
AGAAACGAGTAACTCGTCCCTCTTCTGCAGACTGCTTACGGTTTCGTCCGTGTTGCAGTCGATCATCAGCA 
TACCTAGGTTTCGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTTCTTGGTGTCAACGAGAAAAC 
ACACGTCCAACTCAGTTTGCCTGTCCTTCAGGTTAGAGACGTGCTAGTGCGTGGCTTCGGGGACTCTGTGG 



AAGAGGCCCTATCGGAGGCACGTGAACACCTCAAAAATGGCACTTGTGGTCTAGTAGAGCTGGAAAAAGGC 

GTACTGCCCCAGCTTGAACAGCCCTATGTGTTCATOAAACGTTCTGATGCCTTAAGCACCAAT 

CAAGGTCGTTGAGCTGGTTGCAGAAATGGACGGCATTCAGTACGGTCGTAGCGGTApAACAC^^ 

TCGTGCCACATGTGGGCGAAACCCCAATTGCATACCGCAATGTTCTTCTTCGTAAGAACGGTAATAAGGGA 

GCCGGTGGTCATAGCTATGGCATCGATCTAAAGTCTTATGACTTAGGTGACGAGCTTGGCACTGATCCCAT 

TGAAGATTATGAACAAAACTGGAACACTAAGCATGGCAGTGGTGCACTCCGTGAACTCACTCGTGAGCOXyV 

ATGGAGGTGCAGTCACTCGCTATGTCGACAACAATTTCTGTGGCCCAGATGGGTACCCTCTTGATTGC^ 

AAAGATTTTCTCGCACGCGCGGGCAAGTCTUVTGTGCACTCTTTCCGAAO^CTT^ 

GAGAGGTGTCTACTGCTGCCGTGACCATGAGCATGAAATTGCCTGGTTCACTGAGCGCTCTGAT^^ 

ACGAGCACCAGACACCCTTCGAAATTAAGAGTGCCAAGAAATTTGACACTTTCA^ 

TTTGTGTTTCCTCTTAACTCAAAAGTCAAAGTCATTCAACCACGTGTTGAAAAGAAAAAGACTGAGGGT^ 

CATGGGGCGTATACGCTCTGTGTACCCTGTTGCATCTCCACAGGAGTGTAACAATATGCACTTGTCTACC 

TGATGAAATGTAATCATTGCGATGAAGTTTCATGGCAGACGTGCGACTTTCTGATAGCCACTTGTGJ^ 

TGTGGCACTGAAAATTTAGTTATTGAAGGACCTACTACATGTGGGTACCTACCTACTAATGCTGTAGTGAA 

AATGCCATGTCCTGCCTGTCAAGACCCAGAGATTGGACCTGAGCATAGTGTTGCAGATTATCACAACC^ 

CAAACATTGAAACa?CGACTCCGCAAGGGAGGTAGGACTAGATGTTTTGGAGGCTGTC 

GGCTGCTATAATAAGCGTGCCTACTGGGTTCCTCGTGCTAGTOCTGATATTGGCTCAGGCC^ 

TACTGGTGACAATGTGGAGACCTTGAATGAGGATCTCCTTGAGATACTGAGTCGTGAACGTGTTAACATTA 

ACATTGTTGGCGATTTTCATTTGAATGAAGAGGTTGCCATCATTTTGGCATCTTTCTCTGCTTCTAC7VAGT 

GCCTTTATTGACACTATAAAGAGTCTTGATTACAAGTCTTTCAAAACCATTGTTGAGTCCTGCGGTAACTA 

TAAAGTTACCAAGGGAAAGCCCGTAAAAGGTGCTTGGAACATTGGACAACAGAGATCAGTTTTAACACCAC 

TGTGTGGTTTTCCCTCACAGGCTGCTGGTGTTATO^GATCAATTTTTGCGCGCACACTTGATGCAGCA^ 

CACTCAATTCCTGATTTGCAAAGAGCAGCTGTCACCATACTTGATGGTATTTCTGAACAGTCATO 

TGTCGACGCCATGGTTTATACTTCAGACCTGCTCACCy^CAGTGTCATTATTATGG^ 

GTCTTGTACAACAGACTTCTCAGTGGTTGTCTAATCTTTTGGGCACTACTGTTGAAAAACTCAGGCC^ 

TTTGAATGGATTGAGGCGAAACTTAGTGCAGGAGTTGAATTTCTCAAGGATGCTTGGGAGATTCTCAAATT 

TCTCATTACAGGTGTTTTTGACATCGTC?^GGGTCAAATACAGGTTGCTTCAGATAACATCAAGGATTGTG 

TAAAATGCTTCATTGATGTTGTTAACAAGGCACTCGAAATGTGCATTGATCT^GTCACTATCGCTGGCGCA 

AAGTTGCGATCACTCAACTTAGGTGAAGTCTTCATCGCTCAAAGCAAGGGACTTTACCGTCAGTGTA 

TGGCAAGGAGCAGCTGC7ACTACTCATGCCTCTTAAGGCACCAAAAGAAGTAACCTTTCTTGAAGGTGATT 

CACATGACACAGTACTTACCTCTGAGGAGGTTGTTCTCAAGAACGGTGAACTCGAAGCACTCGAGACGCCC 

GTTGATAGCTTCACAAATGGAGCTATCGTCGGCACACCAGTCTGTGTAAATGGCCTCAT^ 

TAAGGACAAAGAACAATACTGCGCATTGTCTCCTGGTTTACTGGCTACAAACAATGTCTTTCGCTTAAAAGi 

GGGGTGCACCAATTAAAGGTGTAACCTTTGGAGAAGATACTGTTTGGGAAGTTCAAGGTTACAAGAATGTG 

AGAATCACATTTGAGCTTGATGAACGTGTTGACAAAGTGCTTAATGAAAAGTGCTCTGTCTACACTGTTGA 

ATCCGGTACCGAAGTTACTGAGTTTGCATGTGTTGTAGCAGAGGCTGTTCTGAAGACTTTACAACCAGTTT 

CTGATCTCCTTACCAACATGGGTATTGATCTTGATGAGTGGAGTGTAGCTACATTCTACTTATTTGATGAT 

GCTGGTGAAGAAAACTTTTCATCACGTATGTATTGTTCCTTTTACCCTCCAGATGAGGAAGAAGAGGAC^ 

aXXy^GAGTGTGAGGAAGAAGAAATTGATGAT^CCTGTGAACATGAGTACGGTACAGAGGAO^TTAO^ 

GTCTCCCTCTGGAATTTGGTGCCTCAGCTGAAACAGTTCGAGTTGAGGAAGAAGAAGAGGAAGACTGGCTG 

GATGATACTACTGAGCAATCAGAGATTGAGCCAGAACCAGAACCTACACCTGAAGAACCAGTTAATCAGTT 

TACTGGTTATTTAAAACTTACTGACAATGTTGCCATTAAATGTGTTGACATCGTTAAGGAGGCACAAAGTG 

CTAATCCTATGGTGATTGTAAATGCTGCTAACATACACCTGAAACATGGTGGTGGTGTAGCAGGTCCACTC 

AACAAGGCAACCAATGGTGCCATGCAAAAGGAGAGTGATGATTACATTAAGCTAAATGGCCCTCTTACAGT 

AGGAGGGTCTTGTTTGCTTTCTGGACATAATCTTGCTAAGAAGTGTCTGCATGTTGra 

ATGCAGGTGAGGACATCCAGCTTCTTAAGGCAGCATATGAAAATTTCAATTCACAGGACATCTTAC^^ 

CCATTGTTGTCAGCAGGCATATTTGGTGCTAAACCACTTCAGTCTTTACAAGTGTGCGTGCAGACGGTTCG 

TACACAGGTTTATATTGCAGTCAATGACAAAGCTCTTTATGAGCAGGTTGTCATGGATTATCTTGATAACC 

TGAAGCCTAGAGTGGAAGCACCTAAACAAGAGGAGCCACCAAACACAGAAGATTCCAAAACTGAGGAGAAA 

TCTGTCGTACAGAAGCCTGTCGATGTGAAGCCAAAAATTAAGGCCTGCATTGATGAGGTTACCACAACACT 

GGAAGAAACTAAGTTTCTTACCAATAAGTTACTCTTGTTTGCTGATATCAATGGTAAGCTTTACC^ 

CTCAGAACATGCTTAGAGGTGAAGATATGTCTTTCCTTGAGAAGGATGCACCTTACATGGTAGGTGATC 
ATCACTAGTGGTGATATCACTTGTGOTGTAATACCCTCCAAAAAGGCTGGTGGCACTACTGAGATGCTCTC
AAGAGCTTTG7AGAAAGTGCCAGTTGATGAGTATATAACCACGTACCCTGGACAAGGATC 

CACTTGAGGAAGCTAAGACTGCTCTTAAGAAATGCAAATCTGCATT'rTATGTACTACCTTCAGAAGCACCT 

AATGCTAAGGAAGAGATTCTAGGAACTGTATCCTCGAATTTGAGAGAAATGCTTGCTCATGCTGAAGAGAC 

AAGAAAATTAATGCCTATATGCATGGATGTTAGAGCCATAATGGCAACCATCCAACGTAAGTATAAAGGAA 

TTAAAATTCAAGAGGGCATCGTTGACTATGGTGTCCGATTCTTCTTTTATACTAGTAAAGAGCCTGTAGCT 

TCTATTATTACGAAGdTCAACTCTCTAAATGAGCCGCTTGTCACAATGCCAATTGGTTATO 

TTTTAATCTTGAAGAGGCTGCGCGCTGTATGCGTTCTCTTAAAGCTCCTGCCGTAGTGTCJ^GTATCATCA^ 

CAGATGCTGTTACTACATATAATGGATACCTCACTTCGTCATCAAAGACATCTGAGGA 

ACAGTTTCTTTGGCTGGCTCTTACAGAGATTGGTCCTATTCAGGACAGCGTACAGAGTTAGGTGTTGAATT 

TCTTAAGCGTGGTGACAAAATTGTGTACCACACTCTGGAGAGCCCCGTCGAGTTTCATCTTGACGGTGAGG 

TTCTTTCACTTGACAAACTAAAGAGTCTCTTATCCCTGCGGGAGGTTAAGACTATAAAAGTGTTCACAACT 

GTGGACAACACTAATCTCCACACACAGCTTGTGGATATGTCTATGACATATGGACAGCAGTTTGGTCCAAC 

atacttggatggtgctgatgttacaaaaattaaacctcatgtaaatcatgagggtaacSact^ 
tacctagtgatgacacactacgtagtgaagctttcgagtactaccatactcttgatgagagttttc^^ 

AGGTACATGTCTGCTTTAAACCACACAAAGAAATGGAAATTTCCTCAAGTTGGTGGTTT^ 
ATGGGCTGATAACAATTGTTATTTGTCTAGTGTTTTATTAGCACTTCAACAGCTTGAAGTCAAATTCAATG 

caccagcacttcaagaggcttattatagagcccgtgctggtgatgctgctaacttttgtgcactcatactc 
gcttacagtaataaaactgttggcgagcttggtgatgtcagagaaactatgacccatcttctacagcatgc 
taatttggaatctgcaaagcgagttcttaatgtggtgtgtaaacattgtggtcagaaaactactaccttaa 

CGGGTGTAGAAGCTGTGATGTATATGGGTACTCTATCTTATGATAATCTTAAGACAGGTGTTTCCATTC 
TGTGTGTGTGGTCGT^ATGCTACACTATATCTAGTACAACAAGAGTCTTCTO?!^ 

acctgctgagtataaattacagcaaggtacattcttatgtgcgaatgagtacactggtaactatcagtgtg 

gtcattacactcatataactgctaaggagaccctctatcgtattgacggagctcaccttacaaagatgtca 

GAGTACAAAGGACCAGTGACTGATGTTTTCTACAAGGAAACATCTTACACTACAACCATCAAGCCTGTGTC 

gtataaactcgatggagttacttacacagagattgaaccaaaattggatgggtattataaaaaggataatg 

CTTACTATACAGAGCAGCCTATAGACCTTGTACCAACTCAACCATTACCAAATGCGAGTTTTGATAATTTC 

AAACTCACATGMCTAACACAAAATTTGCTGATGATTTAAATCAAATGACAGGCTTCACAAAGCCA^ 

ACGAGAGCTATCTGTCACATTCTTCCCAGACTTGAATGGCGATGTAGTGGCTATTGACTATAGAC^ 

CAGCGAGTTTCAAGAAAGGTGCTAAATTACTGCATAAGCCAATTGTTTGGCACATTAACCJ^ 

AAGACAACC3TTCAAAC<:A?^CACTTCGTGTTTACGTTGTCTTTGGAGTACJ\AA6CCAGTAGAT^ 

TTCATTTGAAGTTCTGGCAGTAGAAGACACACAAGGAATGGACAATCTTGCTTGTGAAAGTCAACAACCCA 

CCTCTGAAGAAGTAGTGGAAAATCCTACCATACAGAAGGAAGTCATAGAGTGTGACGTGAAAACTACCGAA 

GTTGTAGGCAATGTCATACTTAAACCATCAGATGAAGGTGTTAAAGTAACACAAGAGTTAGGTCATC 

TCTTATGGCTGCTTATGTQGAAAACACAAGCATTACCATTAAGAAACCTAATGAGCTTTCACTAGCCTTAG 

GTTTAAAAACAATTGCCACTCATGGTATTGCTGCAATTAATAGTGTTCCTTGGAGTA 

GTCAAACCATTCTTAGGACAAGCAGCAATTACAACy^TCAAATTGCGCTAAGAGATTAGCACAACGT^ 

TAACAATTATATGCCTTATGTGTTTACATTATTGTTCCAATTGTGTACTIOTACTAJ^ 

gaattagagcttcactacctacaactattgctaaaaatagtgttaagagtgttgctaaattatgtt^ 
gccggcattaattatgtgaagtcacccaaattttctaaattgttcacaatcgctatgtggctattgttgtt 

AAGTATTTGCTTAGGTTCTCTAATCTGTGTAACTGCTGCTTTTGGTGTACTCTTATCTAATTTTGGTGCTC 

cttcttattgtaatggcgttagagaattgtatcttaattcgtctaacgttactactatggatttctgtgaa 

GGTTCTTTTCCTTGCAGCATTTGTTTAAGTGGATTAGACTCCCTTGATTCTTATCCAGCTCTO^^ 

TCAGGTGACGATTTCATCGTACAAGCTAGACTTGACAATTTTAGGTCTGGCCGCTGA^ 

ATATGTTGTTCACAAAATTCOTyrTATTTATTAGGTCTTTCAGCTATAATGCAGGTC 

GCTAGTCATTTCATCAGCAATTCTTGGCTCATGTGGTTTATCATTAGTATTGTACAAATGGCACCCGTTTC 

TGCAATGGTTAGGATGTACATCTTCTTTGCTTCTTTCTACTACATATGGAAGAGCTATGTTCATATCATGG 

ATGGTTGCACCTCTTCGACTTGCATGATGTGCTATAAGCGCAATCGTGCCACACGCGTTGAGTGTACAACT 

ATTGTTAATGGCATGAAGAGATCTTTCTATGTCTATGCAAATGGAGGCCGTGGCTTCTGCAAGACTCACAA 

TTGGAATTGTCTCAATTGTGACACATTTTGCACTGGTAGTACATTCATTAGTGATGAAGTTGCT^ 

TGTCACTCCAGTTTAAAAGACCAATCAACCCTACTGACCAGTCATCGTATATTGTTGATAGTGTTGCTGTG 

AAAAATGGCGCGCTTCACCTCTACTTTGACAAGGCTGGTCAAAAGACCTATGAGAGACATCCGCTCTCCC^ 

ttttgtcaatttagacaatttgagagctaacaacactaaaggttcactgcctattaatgtcatagtttttg 

atggcaagtccaaatgcgacgagtctgcttctaagtctgcttctgtgtactacagtcagctgatgtgccaa 

cctattctgttgcttgaccaagctcttgtatcagacgttggagatagtactgaagtttccgttaagatc 

tgatgcttatgtcgacaccttttcagcaacttttagtgttcctatggaaaaacttaaggcacttgttgcta 

cagctcacagcgagttagcaaagggtgtagctttagatggtgtcctotctacattcgtgtc 
CAAGGTGTTGTTGATACCGATCTTGACACAAAGGATGTTATTGAATGTCTCAAACTTTCA^^
CTTAGAAGTGACAGGTGACAGTTGTAACAATTTa^TCCTCACCTATAATAAG^ 

GAGATCTTGGCGCATGTATTGACTGTAATGCAAGGCATATCAATGCCCAAGTAGCAAAAAGTCACAATGTT 

TCACTCATCTGGAATGTAAAAGACTACATGTCTTTATCTGAACAGCTGCGTAAACAAATTCGTAGTGCTGC 

CAAGAAGAACAACATACCTTTTAGACTAACTTGTGCTACAACTAGACAGGTTGTCAATGTCATAACT 

AAATCTCACTCAAGGGTGGTAAGATTGTTAGTACTTGTTTTAAACTTATGCTTAAGGCCACATTATTGT^ 

GTTCTTGCTGCATTGGTTTGTTATATC6TTAT6CCAGTACATACATTGTCAATC 

TGAAATCATTGGTTACAT^GCCATTCAGGATGGTGTCACTCGTGACATCATTTCTAC 

CAAATAAACATGCTGGTTTTGACGCATGGTTTAGCCAGCGTGGTGGTTCATACAAT^AATGAC^^ 

CCTGTAGTAGCTGCTATCATTACAAGAGAGATTGGTTTCATAGTGCCTGGCTTACCGGGTACTGTGCTGA6 

AGCAATCAATGGTGACTTCTTGCATTTTCTACCTCGTGTTTTTAGTGCTGTTGGCAACATTTGCTACACAC 

CTTCCAAACTCATTGAGTATAGTGATTTTGCTACCTCTGCTTGCGTTCTTGCTGCTGAGTGTACy^TTl^ 

AAGGATGCTATGGGCAAACCTGTGCCATATTGTTATGACACTAATTTGCTAGAGGGTTCTATTTCTTATA^ 

TGAGCTTCGTCCAGACyvCTCGTTATGTGCTTATCGATGGTTCCATCATACAGTTTCCTAACAC 

AGGGTTCTGTTAGAGTAGTAACAACTTTTGAaXK:TGAGTACTGTAGACATGGTACATGCGAAAGGT^ 

GTAGGTATTTGCCTATCTACCAGTGGTAGATGGGTTCTTAATAATGAGCATTACAGAGCTCTATCAGGAGT 

TTTCTGTGGTGTTGATGCGATGAATCTCATAGCTAACATCTTTACTCCTCTTGTGCAACCTGTGGGTGCTT 

TAGATGTGTCTGCTTCAGTAGTGGCTGGTGGTATTATTGCCATAOTrGGTGACTTGTGCTGCCTACTACTTT 

ATGAAATTCAGACGTGTTTTTGGTGAGTACAACCATGTTGTTGCTGCTAATGCACTTTTGTTTTTGATGTC 

TTTCACTATACTCTGTCTGGTACCAGCTTACAGCTTTCTGCCGGGAGTCTACTCAGTCTTTTACTTGTACT y 

TGACATTCTATTTCACCAATGATGTTTCATTCTTGGCTCACCTTCAATGGTTT^ 

GTGCCTTTTTGGATAACAGCAATCTATGTATTCTGTATTTCTCTGAAGCACTGCCATT^ 

CTATCTTAGGAAAAGAGTCATGTTTAATGGAGTTACATTTAGTACCTTCGAG6AGGCTGCTTTG 

TTTTGCTCAACAAGGAAATGTACCTAAAATTGCGTAGCGAGACACTGTTGCCACTTACACAGTATAACAGG 

TATCTTGCTCTATATAACAAGTACAAGTATTTCAGTGGAGCCTTAGATACTACCAGCTATCGTGAAGCAGC 

TTGCTGCCACTTAGCAAAGGCTCTAAATGACTTTAGCAACTCAGGTGCTGATGTTCTCTACCAACCACCAC 

agacatcaatcacttctgctgttctgcagagtggttttaggaaaatggcattcccgtcaggcaaagttg;^ 

GGGTGCATGGTACAAGTAACCTGTGGAACTACAACTCTTAATGGATTGTGGTTGGATGAC^ 

TCCAAGACATGTCATTTGCACAGCAGAAGACATGCTTAATCCTAACTATGAAGATCTGCI^ 

CCAACCATAGCTTTCTTGTTCAGGCTGGCAATGTTCAACTTCGTGTTATTGGCCy^T^ 

ctgcttaggcttaaagttgatacttctaaccctaagacacccaagtataaatttgtccgtatccaacctgg 

TCAAACATTTTCAGTTCTAGCATGCTACAATGGTTCACCATCTGGTGTTTATCAGTGTGCCATGAGACCTA 

atcataccattaaaggttctttccttaatggatcatgtggtagtgttggttotaacattgattatgattgc 

GTGTCTTTCTGCTATATGCATCATATGGAGCTTCCAACAGGAGTACACGCTGGTACTGACTTAGAAGGTAA 

attctatggtccatttgttgacagacaaactgcacaggctgcaggtacagacacaaccataacattaaatg 

TTTTGGCATGGCTGTATGCTGCTGTTATCAATGGTGATAGGTGGTOTCTTAATAGATTCA 

aatgactttaaccttgtggcaatgaagtacaactatgaacctttgacacaagatcatgttgacatattggg 

ACCTCTTTCTGCTCAAACAGGAATTGCCGTCTTAGATATGTGTGCT6CTTTGAAAGAGCTGCTGCAGAATG 

gtatgaatggtcgtactatccttggtagcactattttagaagatgagtttacaccatttgatgttgttaga 

CAATGCTCTGGTGTTACCTTCCAAGGTAAGTTCAAGAAAATTGTTAAGGGCACTCATCATTGGATGCTTTT 

aactttcttgacatcactattgattcttgttcaaagtacacagtggtcactgtttttctttgtttacgaga 

ATGCTTTCTTGCCATTTACTCTTGGTATTATGGCAATTGCTGCATGTGCTATGCTGCTTGTTAAGC^ 

CACGCATTCTTGTGCTTGTTTCTGTTACCTTCTCTTGCAACy^GTTGCTTACTTTAATATGGTC 

TGCTAGCTGGGTGATGCGTATCATGACATGGCTT6AATTGGCTGACACTAGCTTGTCTGGTTATAG 

AGGATTGTGTTATGTATGCTTCAGCTTTAGTTTTGCTTATTCTCATGACAGCTCGCACTGTTTATGAT^ 

GCTGCTAGACGTGTTTGGACACTGATGAATGTCATTACACTTGTTTACAAAGTCTACTATGGTAATGCTTT 

AGATCAAGCTATTTCCATGTGGGCCTTAGTTATTTCTGTAACCTCTAACTATTCTGGTGTCGTTACGACTA 

TCATGTTTTTAGCTAGAGCTATAGTGTTTGTGTGTGTTGAGTATTACCCATTGTTATTTATTACTGGCAAC 

ACCTTACAGTGTATCATGCTTGTTTATTGTTTCTTAGGCTATTGTTGCTGCTGCTACTTTGGCCT^ 

TTTACTCAACCGTTACTTCAGGCTTACTCTTGGTGTTTATGACTACTTGGTCTCTACACAAGAATO 

ATATGAACTCCCAGGGGCTTTTGCCTCCTAAGAGTAGTATTGATGCTTTCAAGCTTAACATTAAGTTGT^ 

GGTATTGGAGGTAT^CCATGTATCAAGGTTGCTACTGTACAGTCTAAAATGTCTGACGTAAAGTCCACATC 

TGTGGTACTGCTCTCGGTTCTTCAACAACTTAGAGTAGAGTCATCTTCTAAATTGTGGGCACAATGTGTAC 

AACTCCACAATGATATTCTTCTTGCAAAAGACACAACTGAAGCTTTCGAGAAGATGGTTTCTCTTTTGTCT 

GTTTTGCTATCCATGCAGGGTGCTGTAGACATTAATAGGTTGTGCGAGGAAATGCTCGATAACCGTGCTAC 

TCTTCAGGCTATTGCTTCAGAATTTAGTTCTTTACCATCATATGCCGCTTATGCCACTGCCCAGGAGGCCT 

ATGAGCAGGCTGTAGCTAATGGTGATTCTGAAGTCGTTCTCAAAAAGTTAAAGAAATCT^ 
AAATCTGAGTTTGACCGTGATGCTGCCATGCAACGCAAGTTGGAAAAGATGGCAGATCAGGC

AATGTACAAACAGGCAAGATCTGAGGACAAGAGGGCAAAAGTAACTAGTGCTATGCAAACAATGCTCT^^ 

CTATGCTTAGGAAGCTTGATAATGATGCACTTAACAACATTATCAACAATGCGCGTGATGGTTGTGT^ 

CTCAACATCATACCATTGACTACAGCAGCCT^CTCATGGTTGTTGTCCCTGATTATGGTACCTACAAGAA 

CACTTGTGATGGTAACACCTTTACATATGCATCTGCACTCTGGGAAATCCAGCAAGTTGTTGATGCGGATA 

GCAAGATTGTTCAACTTAGTGAAATTAACATGGACAATTCACCAAATTTGGCTTGGCCTCTTATTGTTACA 

GCTCTAAGAGCCAACTtAGCTGTTAAACTACAGAATAATGAACTGAGTCCAGTAGCACTACGAqAGATGTC 

CTGTGCGGCTGGTACCACACAAACAGCTTGTACTGATGACAATGCACTTGCCTACTATAACAATTC 

GAGGTAGGTTTGTGCTGGCATTACTAT»GACCACCAAGATCTCAAATGGGCTAGATTCC^ 

GGTACAGGTACAATTTACACAGAACTGGAACCACCTTGTAGGTTTGTTACAGACACACCAAAAGGGCCTAA 

AGTGAAATACTTGTACTTCATCAAAGGCTTAAACAACCTAAATAGAGGTATGGTGCTGGGCAGTTTAGCTG 

CTACAGTACGTCTTCAGGCTGGAAATGCTACAGAAGTACCTGCCAATTCAACTGTGCTTTCCTTCTGTGCT . 

TTTGCAGTAGACCCTGCTAAAGCATATAAGGATTACCTAGCAAGTGGAGGACAACCAATCACCAACTGT6T 

GAAGATGTTGTGTACACACACTGGTACAGGACAGGCAATTACTGTAACACCAGAAGCTAACATGGACCAAG 

AGTCCTTTGGTGGTGCTTCATGTTGTCTGTATTGTAGATGCCACATTGACCATC 

TGTGACTTGAAAGGTAAGTACGTCCAAATACCTACCACOTCTGCTAATGACCCAGTGGGTT^ 

AAACACAGTCTGTACCGTCTGCGGAATGTGGAAAGGTTATGGCTGTAGTTGTGACCAACTCCGCGAACCCT 

TGATGCAGTCTGCGGATGCATCAACGTTTTTAAACGGGTTTGCGGTGTAAGTGCAGCCCGTCTTACACCGT 

GCGGCACAGGCACTAGTACTGATGTCGTCTACAGGGCTTTTGATATTTACAACGAAAAAGTTGCTGGTTTT 

GCAAAGTTCCTAT^AAACTAATTGCTGTCGCTTCCAGGAGAAGGATGAGGAAGGCAATTTATTAGACTCOT 

CTTTGTAGTTAAGAGGCATACTATGTCTAACTACCAACATGAAGAGACTATTTATAACTTGGTTAAAGATT 

GTCCAGCGGTTGCTG^CCATGACTTTTTCAAGTTTAGAGTAGATGGT^ 

CAGCGTCTAACTAT^TACACAATGGCTGATTTAGTCTATGCTCTACGTCATTTTGATGAGGGTAAT 

TACATTAAAAGAAATACTCGTCACATACAATTGCTGTGATGATGATTATTTCAATAAGAAGGATTGGTATG 

ACTTCGTAGAGAATCCTGACATCTTACGCGTATATGCTAACTTAGGTGAGCGTGTACGCCAATCATTATTA 

AAGACTGTACAATTCTGCGATGCTATGCGTGATGCAGGCATTGTAGGCGTACTGACATTAGATAATCAGGA 

TCTTAATGGGAACTGGTACGATTTCGGTGATTTCGTACAAGTAGCACCAGGCTGCGGAGTTCCTATTGTGG 

ATTCATATTACTCATTGCTGATGCCCATCCTCACTTTGACTAGGGCATTGGCTGCTGAGTCCCATATGGAT 

GCTGATCTCGCAAAACCACTTATTi^AGTGGGATTTGCTGAAATATGATTTTACGGAAGAGAGACTO 

CTTCGACCGTTATTTTAAATATTGGGACCAGACATACCATCCCAATTGTATTAACT6TTTGGATGATAGGT 

GTATCCTTCATTGTGCAAACTTTAATGTGTTATOTTCTACTGTGTrrcCACCTACAAGT^ 

GTAAGAAAAATATTTGTAGATGGTGTTCCTTTTGTTGTTTCAACTGGATACCATTTTCGTGAGTTAGGAGT 

CGTACATAATCAGGATGTAAACTTACATAGCTCGCGTCTCAGTTTCAAGGAACTTTTAGTGTATGCTGCTG 

ATCCAGCTATGCATGCAGCTTCTGGCAATTTATTGCTAGATAAACGCACTACATGCTTTTCAGTAGCTGCA 

CTAACAAACAATGTTGCTTTTCAAACTGTCAAACCCGGTAATTTTAATAAAGACTTTTATGACTTTGCTGT 

GTCTAAAGGTTTCTTTAAGGAAGGAAGTTCTGTTGAACTAAAACACTTCTTCTT^^ 

CTGCTATCAGTGATTATGACTATTATCGTTATAATCTGCCAACAATGTGTGATATCAGACAACTCCTATTC 

GTAGTTGAAGTTGTTGATAAATACTTTGATTGTTACGATGGTGGCTGTATTAATGCCAACCAAGTAATTO 

TAACAATCTGGATAAATCAGCTGGTTTCCCATTTAATAAATGGGGTAAGGCTAGACTTTATTATGACTCAA 

TGAGTTATGAGGATCAAGATGCACTTTTCGCGTATACTAAGCGTAATGTCATCCCTACTATAACTCAAATG 

AATCTTAAGTATGCCATTAGTGCAAAGAATAGAGCTCGCACCGTAGCTGGTGTCTCTATCTGTAGTACTAT 

GACAAATAGACAGTTTCATCAGAAATTATTGAAGTCAATAGCCGCCACTAGAGGAGCTACTGTGGTAATTG 

gaacaagcaagttttacggtggctggcataatatgttaaaaactgtttacagtgatgtagaaactc^ 

cttatgggttgggattatccaaaatgtgacagagccatcccraacatgcto * 

tcttgctcgcaaacataacacttgctgtaacttatcacaccgtttctacaggttagctaaosagtgtg^ 

aagtattaagtgagatggtcatgtgtggcggctcactatatgttaaaccaggtggaacatcatccggtgat 

gctacaactgcttatgctaatagtgtcttttyicatttgtcaagctgttacagccaatgtaaatgcacttct 

ttcaactgatggtaataAgatagctgacaagtatgtccgcaatctacaacacaggctctatgagtgtctct 

atagaaatagggatgttgatcatgaattcgtggatgagttttacgcttacctgcgtaaacatttctccatg 

atgattctttctgatgatgccgttgtgtgctataacagtaactatgcggctcaaggttt^ 

TAAGAACTTTAAGGCAGTTCTTTATTATO^AAATAATGTGTTCATGTCTGAGGC^^ 

ctgaccttactaaaggacctcacgaattttgctcacagcatacaatgctagttaaacaaggagatgattac 

GTGTACCTGCCTTACCCAGATCCATCAAGAATATTAGGCGCAGGCTGTTTTGTCGATGATATTGTCAAAAC 
AGATGGTACACTTATGATTGAAAGGTTCGTGTCACTGGCTATTGATGCTTACCCACTTACAAAACATCCTA 
ATCAGGAGTATGCTGATGTCTTTCACTTGTATTTACAATACATTAGT^AAGTTACATGATGAGCTTACTGGC 
CACATGTTGGACATGTATTCCGTAATGCTAACTAATGATAACACCTCACGGTACTGGGAACCTGAGTTTTA 
TGAGGCTATGTACACACCACATACAGTCTTGCAGGCTGTAGGTGCTTGTGTATTGTGCAAT!^ 
CACTTCGTTGCGGTGCCTGTATTAGGAGACCATTCCTATGTTGCAAGTGCTGCTATGAC^

ACATCACACAAATTAGTGTTGTCTGTTAATCCCTATGTTTGCAATGCCCCAGGTTGTGATGTCACTGATGT 

GACACAACTGTATCTAGGAGGTATGAGCTATTATTGCAAGTCACATAAGCCTCCCATTAGTTTTCCATTAT 

GTGCTAATGGTCAGGTTTTTGGTTTATACAAAAACACATGTGTAGGCAGTGACAATGTCJVCTGACTTC^ 

GCGATAGCAACATGTGATTGGACTAATGCTGGCGATTACATACTTGCCAACACTTGTACTGAGAGACTCAA 

GCTTTTCGCAGCAGAAACGCTCAAAGCCACTGAGGAAACATTTAAGCTGTCATATGGTATT^ 

GCGAAGTACTCTCTGACAGAGAATTGCATCTTTCATGGGAGGTTGGAAAACCTAGACCACC^ 

AACTATGTCTTTACTGGTTACCGTGTJ^CTAAAAATAGTAAAGTACAGATTGGAGAGTAC^ 

AGGTGACTATGGTGATGCTGTTGTGTACAGAGGTACTACGACATACAAGTTGAATGlTOGTGATO 

TGTTGACATCTCACACTGTAATGCCACTTAGTGCACCTACTCTAGTGCCACAAGAGCACTATGTGAGAATT 

ACTGGCTTGTACCCAACACTCAACATCTCAGATGAGTTTTCTAGCAATGTTGCAAATTATCAAAAGGTCGG 

CATGCAAAAGTACTCTACACTCCAAGGACCACCTGGTACTGGTAAGAGTCATTTTGCCATCGGACTTGCTC 

TCTATTACCCATCTGCTCGCATAGTGTATACGGCATGCTCTCATGCAGCTGTTGATGCCCTATGTGAAAAG 

GCATTAAAATATTTGCCCATAGATAAATGTAGTAGAATCATACCTGCGCGTGCGCGCGTAGAGTGTTT^ 

TAAATTCAT^GTGAATTCAACACTAGAACAGTATGTTTTCTGCACTGTAAATGC^ 

CTGACATTGTAGTCTTTGATGAAATCTCTATGGCTACTAATTATGACTTGAGTGTTGTCAATGCTAGACTT 

CGTGCAAAACACTACGTCTATATTGGCGATCCTGCTCAATTACCAGCCCCCCGCACATTGCTGACTAAAGG 

CACACTAGAACCAGAATATTTTAATTCAGTGTGCAGACTTATGAAAACAATAGGTCCAGACATGTTCCTTG 

GTiACTTGTCGCCGTTGTCCTGCTGAAATTGTTGACACTGTGAGTGCTTTAGTTTATGACAATAAGCTAAAA 

GCACACAAGGATAAGTCAGCTCAATGCTTCAAAATGXTCTACAAAGGTGTTATTACACATGATGTTT^ 

TGCAATCAACAGACCTCAAATAGGCGTTGTAAGAGAATTTCTTACACGCTATCCTGCT^ 

TTTTTATCTCACCTTATAATTCACAGAACGCTGTAGCTTCAAAAATCTTAGGATTGCCTACGCAGACTGTT 

GATTCATCACAGGGTTCTGAATATGACTATGTCATATTCACACAAACTACTGAAACAGCACACTCTTGTAA 

TGTCAACCGCTTCAATGTGGCTATCACAAGGGCAAAAATTGGCATTTTGTGCATAATGTCTGATAGAGATC 

TTTATGACAAACTGCAATTTACAAGTCTAGAAATACCACGTCGCAATGTGGCTACATTACAAGCAGAAAAT 

GTAACTGGACTTTTTAAGGACTGTAGTAAGATCATTACTGGTCTTCATCCTACACAGGCACCTACACACCT 

CAGCGTTGATATAAAGTTCAAGACTGAAGGATTATGTGTTGACATACCAGGCATACCAAAGG^ 

ACCGTAGACTCATCTCTATGATGGGTTTCAAAATGAATTACCAAGTCAATGGTTACCCTAATATGTTTATC 

ACCCGCGAAGAAGCTATTCGTCACGTTCGTGCGTGGATTGGCTTTGATGTAGAGGGCTGT^ 

AGATGCTGTGGGTACTAACCTACCTCTCCAGCTAGGATTTTCTACAGGTGTTAACTTAGTAGCTGTACCGA 

CTGGTTATGTTGACACTGAAAATAACACAGAATTCACCAGAGTTAATGCAAAACCTCCACCAGGTGACCAG 

TTTAAACATCTTATACCACTCATGTATAAAGGCTTGCCCTGGAATGTAGTGCGTATTAAGATAGTACAAAT 

GCTCAGTGATACACTGAAAGGATTGTCAGACAGAGTCGTGTTCGTCCTTTGGGCGCATGGCTTTGAGCTTA 

CATCAATGAAGTACTTTGTCAAGATTCGACCTGT^GAACGTGTTGTCTGTGTGACAAACGTGCAACTT^ 

TTTTCTACTTCATCAGATACTTATGCCTGCTGGAATCATTCTGTGGGTTTTGACTATGTCT^ 

TATGATTGATCTTCAGCAGTGGGGCTTTACGGGTAACC^TCAGAGTAACCATGACCAACATTGC^ 

ATGGAAATGCACATGTGGCTAGTTGTGATGCTATCATGACTAGATGTTTAGCAGTCCATGAGTGCT^^ 

AAGCGCGTTGATTGGTCTGTTGAATACCCTATTATAGGAGATGAACTGAGGGTTAATTCTGCTTGCAGAAA 

AGTACAACACATGGTTGTGAAGTCTGCATTGCTTGCTGATAAGTTTCCAGTTCTTCATGACATTGGAAATC 

CAAAGGCTATCAAGTGTGTGCCTCAGGCTGAAGTAGAATGGAAGTTCTACGATGCTCAGCCATGTAGTGAC 

AAAGCTTACAAAATAGAGGAACTCTTCTATTCTTATGCTACACATCACGATAAATTCACTGATGGTGTTTG 

TTTGTTTTGGAATTGTAACGTTGATCGTTACCCAGCCAATGCAATTGTGTGTA^ 

TGTCAJ^ACTTGAACTTACCAGGCTGTGATGGTGGTAGTTTGTATGTGAATAAGCATGCATTCCACACTCC^ 

GCTTTCGATAAAAGTGCATTTACTAATTTAAAGCAATTGCCTTTCTTTTACTATTCTGATAGTCCTTG 

GTCTCATGGCAAACAAGTAGTGTCGGATATTGATTATGTTCCACTCAAATCTGCTACGTGTATTACACGAT 

GCAATTTAGGTGGTGCTGTTTGCAGACACCATGCAAATGAGTACCGACAGTACTTGGATGCATATAATATG 

ATGATTTCTGCTGGATTTAGCCTATGGATTTACAAACAATTTGATACTTATAACCTGTGGAATACATTTAC 

CAGGTTACAGAGTTTAGAAAATGTGGCTTATAATGTTGTTAATAAAGGACACTTTGATGGAC^ 

AAGCACCTGTTTCCATCATTAATAATGCTGTOTACACAAAGGTAGATGGTATTGATGTGGAGATCT 

AATAAGACAACACTTCCTGTTAATGTTGCATTTGAGCTTTCGGCTAAGCGTAACATTAAACCAGTCCC^ 

GATTAAGATACTCAATAATTTGGGTGTTGATATCGCTGCTAATACTGTAATCTGGGACTACAAAAGAGAAG 

CCCCAGCACATGTATCTACAATAGGTGTCTGCACAATGACTGACATTGCCAAGAAACCTACTGAGAGTGCT 

TGTTCTTCACTTACTGTCTTGTTTGATGGTAGAGTGGAAGGACAGGTAGACCTTTTTAGAAACGCCCGTAA 

TGGTGTTTTAATAACAGAAGGTTCAGTCAAAGGTCTAACACCTTCAAAGGGACCAGCACAAGCTAGCGTCA 

ATGGAGTCACATTAATTGGAGAATCAGTAAAAACACAGTTTAACTACTTTAAGAAAGTAGACGGCATTATT 

CAACAGTTGCCTGAT^CCTACTTTACTCAGAGCAGAGACTTAGAGGATTTTAAGCCCUVGATCACAA^ 

AACTGACTTTCTCGAGCTCGCTATGGATGT^TTCATACAGCGATATAAGCTCGAGGGCTATGCCTTCGAAC 
ACATCGTTTATGGAGATTTCAGTCATGGACAACTTGGCGGTCTTCATTTAATGATAGGCTTA^

TCACAAGATTCACCACTTAAATTAGAGGATTTTATCCCTATGGACAGCACAGTGAAAAATTACTTCATi^ 

AGATGCGCAAACAGGTTCATCAAT^TGTGTGTGTTCTGTGATTGATCTTTTACTTGATGACTTTGTCGAGA 

TAATAAAGTCACAAGATTTGTCAGTGATTTCAAAAGTGGTCAAGGTTACAATTGACTATGCTGAAATTTCA 

TTCATGCTTTGGTGTAAGGATGGACATGTTGAAACCTTCTACCCAAAACTACAAGCAAGTCGAGCGTGGC^ 

ACCAGGTGTTGCGATCCCTAACTTGTACAAGATGCAAAGAATGCTTCTTGAAAAGTGTGACCTTCAGAATT 

ATGGTGAAAATGCTGtltATACCAAAAGGAATAATGATGAATGTCGCAAAGTATACTa^ 

TTAAATACACTTACOm'AGCTGTACCCTACAACATGAGAGTTATTCACTTTGGTC 

AGTTGCACCAGGTACAGCTGTGCTCAGACAATGGTTGCCAACTGGCACACTACTTGTCGATTCAGATCTTA 

ATGACTTCGTCTCCGACGCATATTCTACTTTAATTGGAGACTGTGCAACAGTACATACGGCTAATAAATGG 

GACCTTATTATTAGCGATATGTATGACCCTAGGACCAAACATGTGACAAAAGAGAATGACTCTAAAGAAGG 

GTTTTTCACTTATCTGTGTGGATTTATAAAGCAAAAACTAGCCCTGGGTGGTTCTATAGCTGTAAAGATAA- 

CAGAGCATTCTTGGAAOXSCTGACCTTTACAAGCTTATGGGCCATTTCTCATGGTGGAC^ 

AATGTAAATGCATCATCTVTaSGAAGCATTTTTAATTGGGGCTAACTATC 

TGATGGCTATACCATGCATGCTAACTACATiTTCTGGAGGAACACT^TCCTATCCAGTTC 

CACTCTTTGACATGAGCAAATTTCCTCTTAAATTAAGAGGAACTGCTGTAATCTCTCTTAAGGAGAATCAA 

ATCAATGATATGATTTATTCTCTTCTGGAAAAAGGTAGGCTTATCATTAGAGAAAACAACAGAGTTGTGGT 

TTCAAGTGATATTCTTGTTAACAACTAAACGAACATGTTTATTTTCTTATTATTTCTTACTCTCACTAGTG 

GTAGTGACCTTGACCGGTGCACCACTTTTGATGATGTTCAAGCTCCTAATTACACTCAACATACTTCA 

ATGAGGGGGGTTTACTATCCTGATGAAATTTTTAGATCAGACACTCTTTATTTAACTCAGGATTTATT^ 

TCCATTTTATTCTAATGTTACAGGGTTTCATACTATTAATCATACGTTTC^ 

AGGATGGTATTTATTt!tGCT(3CCACAGAGAAATCAAATGTTGTCCGTGGTTGGGTTTTTGGTO 

AACAACAAGTCACAGTCGGTGATTATTATTAACAATTCTACTAATGTTGTTATACGAGCATGTAACTTTGA 

attgtgtgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgata 

ATGCATTTAATTGCACTTTCGAGTACATATCTGATGCCTTTTCGCTTGATGTTTCAGAAAAGTCAGGTAAT 

tttaaacacttacgagagtttgtgtttaaaaataaagatgggtttctctatgtttataagggctatcaacc 

TATAGATGTAGTTCGTGATCTACCTTCTGGTTTTAACACTTTGAAACCTATTTTTAAGTTGCCTCTTGGTA 

ttaacattacaaattttagagccattcttacagccttttcacctgctcaagacatttggggcacgtcagct 

GCAGCCTATTTTGTTGGCTA'ITTAAAGCCAACTACATTTATGCTCAAGTATCATGA^^ 

agatgctgttgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgaca 

AAGGAATTTACCAGACCTCTAATTTCAGGGTTGTTCCCTCAGGAGATGTTGTGAGATTCCCTAATATTACA 

AACTTGTGTCCTTTTGGAGAGGTTTTTAATGCTACTAAATTCCCTTCTGTCTATGCATGGGAGAGAAAAAA 

AATTTCTAATTGTGTTGCTGATTACTCTGTGCTCTACAACTCAACATTTTTTTCAACCTTTAAGTGCTATG 

GCGTTTCTGCCyVCTAAGTTGAATGATCTTTGCTTCTCCAATGTCTATGCAGATTCTTTTGTAG^ 

GATGATGTAAGACAAATAGCGCCAGGACAAACTGGTGTTATTGCTGATTATAATTATAAA^^ 

TTTCATGGGTTGTGTCCTTGCTTGGAATACTAGGAACATTGATGCTACTTCAACTGGTAATTAT^ 

AATATAGGTATCTTAGACATGGCAAGCTTAGGCCCTTTGAGAGAGACATATCTAATGTGCCTTTCTCCCCT 

GATGGCAAACCTTGCACCCCACCTGCTCTTAATTGTTATTGGCCATTAAATGATTATGGTTTTTACACCAC 

TACTGGCATTGGCTACCAACCTTACAGAGTTGTAGTACTTTCTTTTGAACTTTTAAATGCACCGGCCACX3G 

•tttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggac 
ggtactggtgtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgt^ 
oittcactgattccgttcgagatcctaaaacatctgaaatattagacatttcaccttc 

TAAGTGTTATTACACCTGGAACAAATGCTTCATCTGAAGTTGCTGTTCTATATCAAGATGTTAACTGCACT 

gatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattctactggaaacaatgt 
attccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattccta 
ttggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtg 

GCTTATACTATGTCTTTAGGTGCTGATAGTTCAATTGCTTACTCTAATAACACCATTGCTATACCTACTAA 

cttttcaattagcattactacagaagtaatgcctgtttctatggctaaaacctccgtagattgtaatatc 

ACATCTGCGGAGATTCTACTGAATGTGCTAATTTGCTTCTCCAATATGGTAGCTTTTGCAC^ 

CGTGCACTCTCAGGTATTGCTGCTGAACAGGATCX3CAACACACGTGAAGTGTTCGCTCAAGTCAAACA 

GTACAAAACCCCAACTTTGAT^TATTTTGGTGGTTTTAATTTTTCACAAATATTACCTC 

CAACTAAGAGGTCTTTTATTGAGGACTTGCTCTTTAATAAGGTGACACTCGCTGATGCTGGCTTCATGAAG 

CAATATGGCGAATGCCTAGGTGATATTAATGCTAGAGATCTCATTTGTGCGCAGAAGTTCAATGGACTTAC 

AGTGTTGCCACCTCTGCTCACTGATGATATGATTGCTGCCTACACTGCTGCTCTAGTTAGTGGTACTGCCA 

CTGCTGGATGGACATOTGGTGCTGGCGCTGCTCTTCAAATACCTTTTGCTATGCAAATGGCATATA^^ 

AATGGCATTGGAGTTACCCAAAATGTTCTCTATGAGAACCAAAAACAAATCGCCAACCAATTT 

GATTAGTCAAATTCAAGAATCACTTACAACAACATCAACTGCATTGGGCAAGCTGCTAGACGT^ 
AGAATGCTCT^GCATTAAACACACTTGTTAAACAACTTAGCTCTAATTTTGGTGCAAT^^
AATGATATCCTTTCGCGACTTGATAAAGTCGAGGCGGAGGTACAAATTGAC^ 

TCAAAGCCTTCAAACCTATGTAACACAACAACTAATCAGGGCTGCTGAAATCAGGGCTTCTGCTAATC^^ 

CTGCTACTAAAATGTCTGAGTGTGTTCTTGGACAATCAAAAAGAGTTGACTTTTGTGGAAAGGGCTACCAC 

CTTATGTCCTTCCCACAAGCAGCCCCGCATGGTGTTGTCTTCCTACATGTCACGTATGTGCCATCCCAGGA 

GAGGAACTTCACCACAGCGCCAGCAATTTGTCATGAAGGCAAAGCATACTTCCCTCGTGAAGGTGTTT^ 

TGTTTAATGGCy^CTTCTTGGTTTATTAa^CAGAGGAACTTCTTTTCTCCACAAATAATTACT^ 

ACATTTGTCTCAGGAAATTGTGATGTCGTTATTGGCATCATTAACAACACAGTTTA'^^ 

TGAGCTTGACTCATTCAAAGAAGAGCTGGACAAGTACTTCAAAAATCATACATCACCAGATGT^ 

GCGACATTTCAGGCATTAACGCTTCTGTCGTCAACATTCAAAAAGAAATTGACCGCCTCAATGAGGTCGCT 

AAAAATTTAAATGAATCACTCATTGACCTTCAAGAATTGGGAAAATATGAGCAATATATTAAATGGCCTTG 

GTATGTTTGGCTCGGCTTCATTGCTGGACTAATTGCCATCGTCATGGTTACAATCTTGCTTTGTTGCATGA 

CTAGTTGTTGCAGTTGCCTCAAGGGTGCATGCTCTTGTGGTTCTTGCTGCAAGTTTGATGAGGATGACTCT 

GAGCCAGTTCTCAAGGGTGTCAAATTACATTACACATAAACGAACTTATGGATTTGOVrTAT^ 

ACTCTTGGATCAATTACTGCACAGCCAGTAAAAATTGACAATGCTTCTCCTGCA^ 

AGCAACGATACCGCTACy^GCCTCACTCCCTTTCGGATGGCTTGTTATTGGCGTTGCATTTC 

TTCAGAGCGCTACCAAAATAATTGCGCTCAATAAAAGATGGCAGCTAGCCCTTTATAAGGGCTTCCAGTTC 

ATTTGCAATTTACTGCTGCTATTTGTTACCATCTATTCACATCTTTTGCTTGTCGCTGCAGGTATGGAGGC 

GCAATTTTTGTACCTCTATGCCTTGATATATTTTCTACAATGCATCAACGCATGTAGAATTATTATGAGAT 

GTTGGCTTTGTTGGAAGTGCAAATCCAAGAACCCATTACTTTATGATGCCAACTACTTTGTOTOCT^ 

ACACATAACTATGACTACTGTATACCATATAACAGTGTCACAGATACAATTGTCGTTACTGAAGGTGACGG 

CATTTCAACACCAAAACTCAAAGAAGACTACCAAATTGGTGGTTATTCTGAGGATAGGCACTC^ 

AAGACTATGTCGTTGTACATGGCTATTTCACCGAAGTTTACTACCAGCTTGAGTCTACACAAATTACTAC^ 

GACACTGGTATTGAAAATGCTACATTCTTCATCTTTAACAAGCTTGTTAAAGACCCACCGAATGTGCAAAT 

ACACACAATCGACGGCTCTTCAGGAGTTGCTAATCCAGCAATGGATCCAATTTATGATGAGCCGACGACGA 

CTACTAGCGTGCCTTTGTAAGCACAAGAAAGTGAGTACGAACTTATGTACTCATTCGTTTCGGAAGAAACA 

GGTACGTTAATAGTTAATAGCGTACTTCTTTTTCTTGCTTTCGTGGTATTCTTGCTAGTCACACTAGCCAT 

CCTTACTGCGCTTCGATTGTGTGCGTACTGCTGCAATATTGTTAACGTGAGTTTAGTAAAACCAACGGl^ 

ACGTCTACTCGCGTGTTAAAAATCTGAACTCTTCTGAAGGAGTTCCTGATCTTCTTC 

CTATTATTATTATTCTGTTTGGAACTTTAACATTGCTTATCATGGCAGACAACGGTACTATTACCGTTGAG 

GAGCTTAAACAACTCCTGGAACAATGGAACCTAGTAATAGGTTTCCTATTCCTAGCCTGGATTATGOTACT 

ACAATTTGCCTATTCTAATCGGAACAGGTTTTTGTACATAATAAAGCTTGTTTTCCTCTGGCTCTTGTGGC 

CAGTAACACTTGCTTGTTTTGTGCTTGCTGCTGTCTACAGAATTAATTGGGTGACTGGCGGGATTGCGATT 

GCAATGGCTTGTATTGTAGGCTTGATGTGGCTTAGCTACTTCGTTGCTTCCTTCAGGCTGTTTGCTCGTAC 

CCGCTCAATGTGGTO^TTCAACCCAGAAACAAACATTCTTCTCAATGTGCCTCTCCGGGGGACyU^ 

CCAGACCGCTCATGGAAAGTGAACTTGTCATTGGTGCTGTGATCATTCGTGGTCACTTGCGAATGGCCGGA 

CACTCCCTAGGGCGCTGTGACATTAAGGACCTGCCAATIAGAGATCACTGTGGCTACATCTVCGAACGCTT^ 

TTATTACAAAOTAGGAGCGTCGCAGCGTGTAGGCACTGATTCAGGTTTTGCTGCATACAACCGCTACCGTA 

TTGGAAACTATAAATTAAATACAGACCACGCCGGTAGCAACGACAATATTGCTTTGCTAGTACAGTAAGTG 

ACAACAGATGTTTCATCTTGTTGACTTCCAGGTTACAATAGCAGAGATATTGATTATCATTATGAGGACTT 

TCAGGATTGCTATTTGGAATCTTGACGTTATAATAAGTTCAATAGTGAGACAATTATTTAAGCCTCTAACT 

AAGAAGAATTATTCGGAGTTAGATGATGAAGAACCTATGGAGTTAGATTATCCATAAAACGAACATGAAAA 

TTATTCTCTTCCTGACATTGATTGTATTTACATCTTGCGAGCTATATCACTATCAGGAGTGTGTTAGAGGT 

ACGACTGTACTACTAAAAGAACCTTGCCCATCAGGAACATACGAGGGCAATTCACCATTTCACCCTCl^^ 

TGACAATAAATTTGCACTAACTTGCACTAGCACACACTTTGCTTTTGCTTC 

CCTATCAGCTGCGTGCAAGATCAGTTTCACCAAAACTTTTCATCAGACAAGAGGAGGTTCAACAAGAGCTC 

TACTCGCCACTTTTTCTCATTGTTGCTGCTCTAGTATTTTTAATACTTTGCTTCACCATTAAGAGAAAGAC 

AGAATGAATGAGCTCACTTTAATTGACTTCTATTTGTGCTTTTTAGCCTTTCTGCTATTCCTTGTTTTAAT 

AATGCTTATTATATTTTGGTTTTCACTCGAAATCCAGGATCTAGAAGAACCTTGTACCAAAGTCTAAACGA

ACATGAAACTTCTCATTGTTTTGACTTGTATTTCTCTATGCAGTTGCATATGCACTGTAGTACAGCGCTOT

GCATCTAATAAACCTCATGTGCTTGAAGATCCTTGTAAGGTACAACACTAGGGGTAATACTTATAGCACTG 

CTTGGCTTTGTGCTCTAGGAAAGGTTTTACCTTTTCATAGATGGCACACTATGGTTCAAACATGCAC^ 

AATGTTACTATCAACTGTCAAGATCCAGCTGGTGGTGCGCTTATAGCTAGGTGTTGGTACCTTCATGAAGG 

TCACCAAACTGCTGCATTTAGAGACGTACTTGTTGTTTTAAATAAACGAACAAATTAAAATGTCTGATAAT 

GGACCCCAATCAAACCAACGTAGTGCCCCCCGCATTACATTTGGTGGACCCACAGATTCAACTGACAATAA 

CCAGAATGGAGGACGCAATGGGGCAAGGCCAAAACAGCGCCGACCCCAAGGTTTACCCAATAATACTGCGT 

CTTGGTTCACAGCTCTCACTCAGO^TGGCT^GGAGGAACTTAGATTCCCTCGAGGCCAGGGCGTTCCAATC
AACACCAATAGTGGTCCAGATGACCAAATTGGCTACTACCGAAGAGCTACCCGACGAGTTCGTGGTGGT^

CGGCAAAATGAAAGAGCTCAGCCCCAGATGGTACTTCTATTACCTAGGAACTGGCCCAGAAGCTTCACT^ 

CCTACGGCGCTAACAAAGAAGGCATCGTATGGGTTCCAACTGAGGGAGCCTTGAATACACCC^^ 

ATTGGCACCCGCAATCCTAATAACAATGCTGCCACCGTGCTACAACTTCCTCAAGGAACAACATTGCCAAA 

AGGCTTCTACGCAGAGGGAAGCAGAGGCGGCAGTCAAGCCTCTTCTCGCTCCTCATCACGTAGTCGCGGTA 

ATTCAAGAAATTCAACTCCTGGCAGCAGTAGGGGAAATTCTCCTGCTCGAATGGCTAGCGGAGGTGGTGAA 

ACTGCCCTCGCGCTATTGCTGCTAGACAGATTGAACCAGCTTGAGAGCAAAGTTTCTGGTAAAGGCCAACA 

ACAACAAGGCCAAACTGTCACTAAGAAATCTGCTGCTGAGGCATCTAAAAAGCCTCGCCAAAAACGTACTG 

CCACAAAACAGTACAACGTCy^CTCAAGCATTTGGGAGACGTGGTC^^ 

GACCAAGACCTAATCAGACa^GGAACTGATTACAAACATTGGCCGCAAATTG 

CTCTGCATTCTTTGGAATGTCACGCATTGGCATGGAAGTCACACCTTCGGGAACATGGCTGACTTATCATG 

GAGCCATTAAATTGGATGACAAAGATCCACAATTCAAAGACAACGTCATACTGCTGAACAAGCACATTGAC

GCATACAAAACATTCCCACCAACAGAGCCTAAAAAGGACAAAAAGAAAAAGACTGATGAAGCTCAGCCTTT 

GCCGCAGAGACAAAAGAAGCAGCCCACTGTGACTCTTCTTCCTGCGGCTGACATGGATGATTTCTCCAGAC 

AACTTCJU^TTCCATGAGTGGAGCTTCTGCTGATTCAACTCAGGCATAAACACTCAT^ 

GGCAGATGGGCTATGTAAACGTTTTCGCAATTCCGTTTACGATACATAGTCTACTCTTGTGCAGAAT^ 

TCTCGTAACTAAACAGCACAAGTAGGTTTAGTTAACTTTAATCTCACATAGaUVTCTTTAATC;^ 

ACATTAGGGAGGACTTGAAAGAGCCACCACATTTTCATCGAGGCCACGCGGAGTACGATCGAGGGTACAGT

GAATAATGCTAGGGAGAGCTGCCTATATGGAAGAGCCCTAATGTGTAAAATTAATTTTAGTAGTGCTATCC 

CCATGTGATTTTAATAGCTTCTTAGGAGAATGACAAAAAAAAAAAAAAAAAAAAAAAA * 

GenBank Accession No. AY274119.1; SEQ ID NO: 1

CTACCCAGGAAAAGCCAACCAACCTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGT^

AGCTGTCGCTCGGCTGCATGCCTAGTGCACCTACGCAGTATAAACAATAATAAAOTTTACTGTaSf^ 

AGAAACGAGTAACTCGTCCCTCTTCTGCAGACTGCTTACGGTTTCGTCCGTGTTGCAGTCGAOKrATC^ 

TACCTAGGTTTCGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTTCTTGGTGTCAACGAGAAAAC 

ACACGTCCAACTCAGTTTGCCTGTCCTTCAGGTTAGAGACGTGCTAGTGCGTGGCTTCGGGGACTCTGTGG

AAGAGGCCCTATCGGAGGCACGTGAACACCTCAAAAATGGCACTTGTGGTCTAGTAGAGCTGGAAAAAGGC 

GTACTGCCCCAGCTTGAACAGCCCTATGTGTTCATTAAACGTTCTGATGCCTTAAGCACCAATCACGGCC^ 

CAAGGTCGTTGAGCTGGTTGCAGT^TGGACGGCATTCAGTACGGTCGTAGCGGTATIAAC^ 

TCGTGCCACATGTGGGCGAAACCCCAATTGCATACCGCAATGTTCTTCTTCGTAAG^^ 

GCCGGTGGTCATAGCTATGGCATCGATCTAAAGTCTTATGACTTAGGTGACGAGCTTGGOVCTGAT^ 

TGAAGATTATGAACAAAACTGGAACACTAAGCATGGCAGTGGTGCACTCCGTGAACTCACTCGTGAGCT^ 

ATGGAGGTGCAGTCACTCGCTATGTCGACAACAATTTCTGTGGCCCAGATGGGTACCCTCTTGATTGCATC 

AAAGATTTTCTCGCACGCGCGGGCAAGTCAATGTGCACTCTTTCCGAACAACTTGATTACT^TCGAGTCGAA 

GAGAGGTGTCTACTGCTGCCGTGACCATGAGCATGAAATOGCCTGGTTCACTGAGCGCTCTGATAAGAGCT 

ACGAGCACCAGACACCCTTCGAAATTAAGAGTGCCTVAGAAATTTGACACTTTCa^ 

TTTGTGTTTCCTCTTAACTCAAAAGTCy^GTCATTCAACCACGTGTTGAAAAGAAAAA 

CATGGGGCGTATACGCTCTGTGTACCCTGTTGCATCTCCACAGGAGTGTAACAATATGCACTTGTCTAC^ 

TGATGAAATGTAATCATTGCGATGAAGTTTCATGGCAGACGTGCGACTTTCTGAAAGCCACTTGTGAACAT 

TGTGGCACTGAAAATTTAGTTATTGAAGGACCTACTACATGTGGGTACCTACCTACTAATGCTGTAGTGAA 

AATGCCATGTCCTGCCTGTCAAGACCCAGAGATTCGACCTGAGCATAGTGTTGCAGATTATCAC^ 

CAAACATTGAAACTCGACTCCGCAAGGGAGGTAGGACTAGATGTTTTCGAGGCTGTGTGTTTGCCTA^ 

GGCTGCTATAATAAGCGTGCCTACTGGGTTCCTCGTGCTAGTGCTGATATTGGCTCAGGCCATACTC 

TACTGGTGACAATGTGGAGACCTTGAATGAGGATCTCCTTGAGATACTGAGTCGTGAACGTC 

ACATTCTTGGCGATTTTCATTTGAATGAAGAGGTTGCCyiTCATTTTGGCATCTOT 

GCCTTTATTGACACTATAAAGAGTCTTGATTACAAGTCTTTCAAAACCATTGTTGAGTCCTGCGGTAACTA 

TAAAGTTACCAAGGGAAAGCCCGTAAAAGGTGCTTGGAACATTGGACAACAGAGATCAGTTTTAACACCAC 

TGTGTGGTTTTCCCTCACAGGCTGCTGGTGTTATCAGATCAATTTTTGCGCGCACACTTGAT<^ 

CACTCAATTCCTGATTTGCAAAGAGCAGCTGTCACCATACTTGATGGTATTTCTGAACAGTCATTACGTCT 

TGTCGACGCCATGGTTTATACTTCAGACCTGCTCACCAACAGTGTCATTATTATGGCATATGT^ 

GTCTTGTACAACAGACTTCTCAGTGGTTGTCTAATCTTTTGGGCACTACTGTTGAAAAACTCy^GG 

TTTGAATGGATTGAGGCGAAACTTAGTGCAGGAGTTGAATTTCTCAAGGATCCTTGGGAGATTC 

TCTCATTACAGGTGTTTTTGACATCGTCAAGGGTCAAATACAGGTTGCTTCAGATAACATCAAGGATTGTG 

TAAAATGCTTCATTGATGTTGTTAACAAGGCACTCGAAATGTGCATTGATCAAGTCACTATCGCTGGCGCA 

AAGTTGCGATCACTCAACTTAGGTGAAGTCTTCATCGCTCAAAGCAAGGGACTTTACCGTCAGXGTATACG 

TGGCAAGGAGCAGCTGCAACTACTCATGCCTCTTAAGGCACCAAAAGAAGTAACCTTTCTTGAAGGTGATT 

CACATGACACAGTACTTACCTCTGAGGAGGTTGTTCTCAAGAACGGTGAACTCGAAGCACTCGAGACGCCC 

GTTGATAGCTTCACAAATGGAGCTATCGTTGGCACACCAGTCTGTGTAAATGGCCTCATGCTC^ 

TAAGGACAAAGAACAATACTGCGCATTGTCTCCTGGTTTACTGGCTACAAACy^TGTCTTTCC^ 

GGGGTGCACCAATTAAAGGTGTAACCTTTGGAGAAGATACTGTTTGGGAAGTTCAAGGTTACAAGAATGTG 

AGAATCACATTTGAGCTTGATGAACGTGTTGACAAAGTGCTTAATGAAAAGTGCTCTGTCTACACTGTTGA 

ATCCGGTACCGAAGTTACTGAGTTTGCATGTGTTGTAGCAGAGGCTGTTGTGAAGACTTTACAACCAGTTT 

CTGATCTCCTTACCAACATGGGTATTGATCTTGATGAGTGGAGTGTAGCTACATTCTACTTATTTGATGAT 

GCTGGTGAAGAAAACTTTTCATCACGTATGTATTGTTCCTTTTACCCTCCAGATGAGGAAGAAGAGGACGA 

TCCAGAGTGTGAGGAAGAAGAAATTGATGAAACCTGTGAACAO^GTACGGTACAGAGGATGATO 

GTCTCCCTCTGGAATTTGGTGCCTCAGCTGAAACAGTTCGAGTTGAGGAAGAAGAAGAGGAAGACT^^

GATGATACTACTGAGCAATCAGAGATTGAGCCAGAACCAGAACCTACACCTGAAGAACCAGTTAATeAGTT 

TACTGGTTATTTAAAACTTACTGACAATGTTGCCATTAAATGTGTTGACATCGTTAAGGAGGCACAAAGTG 

CTAATCCTATGGTGATTGTAAATGCTGCTAACATACACCTGAAACATGGTGGTGGTGTAGCAGGTGCACTC 

AACAAGGCAACCAATGGTGCCATGCAAAAGGAGAGTGATGATTACATTAAGCTAAATGGCCCTCTTACAGT 

AGGAGGGTCTTGTTTGCTTTCTGGACATAATCTTGCTAAGAAGTGTCTGCATGTTGTTGGACCTAACCTAA 

ATGCAGGTGAGGACATCCAGCTTCTTAAGGCAGCATATGAAAATTTCAATTCACAGGACATC 

CCATTGTTGTCAGCAGGCATATT^GTGCTAAACCACTTCAGTCTTTACAAGTGTGCGTGC^ 

TACACAGGTTTATATTGCAGTCAATGACAAAGCTCTTTATGAGCAGGTTGTCATGGATTATCTTGATAACC 

TGAAGCCTAGAGTGGAAGCACCTAAACAAGAGGAGCCACCAT^CACAGAAGATTCCAAAACTGAGGAGAAA 

TCTGTCGTACAGAAGCCTGTCGATGTGAAGCCAAAAATTAAGGCCTGCATTGATGAGGTTACCA(:7^CACT 

GGAAGAAACTAAGTTTCTTACCAATAAGTTACTCTTGTTTGCTGATATCAATGGTAAGCTTTACCATGATT 

CTCAGAACATGCTTAGAGGTGAAGATATGTCTTTCCTTGAGAAGGATCCACCTTACATGGTAGGTGAro
ATCACTAGO^TGATATCACTTGTGTTGTAATACCCTCCAAJ^AAGGCTGGTGGCACTACTGAGAT^

AAGAGCTTTGAAGAAAGTGCCAGTTGATGAGTATATAACCACGTACCCTGGACAAGGATGTGCTGGTTATA 

CACTTGAGGAAGCTAAGACTGCTCTTAAGAAATGCAAATCTGCATTTTATGTACTACCTTCAGAAGCACCT 

AATGCTAAGGAAGAGATTCTAGGAACTGTATCCTGGAATTTGAGAGAAATGCTTGCTCATGCTGAAGAGAC 

AAGAAAATTAATGCCTATATGCATGGATGTTAGAGCCATAATGGCAACCATCCAACGTAAGTATAAAGGAA 

TTAAAATTCAAGAGGGCATCGTTGACTATGGTGTCCGATTCTTCTTTTATACTAGTAAAGAGCCTC 

TCTATTATTACGAAGCtfGAACTCTCTAAATGAGCCGCTTGTCACAATGCC/A 

TTTTAATCTTGAAGAGGCTGCGCGCTGTATGCGTTCTCTTAAAGCTCCTGCCXSTAGTGTCAGTAl^ 

CAGATGCTGTTACTACATATAATGGATACCTCACTTCGTCATCAAAGACATCTGAGGAGCACTTTGTAGAA 

ACAGTTTCTTTGGCTGGCTCTTACAGAGATTGGTCCTATTCAGGACAGCGTACAGAGTTAGGTGTTGAATT 

TCTTAAGCGTGGTGACAAAATTGTGTACCACACTCTGGAGAGCCCCGTCGAGTTTCATCTTGACGGTGAGG 

TTCTTTCACTTGACAAACTAAAGAGTCTCTTATCCCTGCGGGAGGTTAAGACTATAAAAGTGTTCACAACT

GTGGACAACACTAATCTCCACACACAGCTTGTGGATATGTCTATGACATATGGACAGCAGTTTGGTCCAAC

ATACTTGGATGGTGCTGATGTTACAAAAATTAi^CCTCATGTAAATCATGAGGGTAAGACT^ 

TACCTAGTGATGACACACTACGTAGTGAAGCTTTCGAGTACTACCATACTCTTGATGAGAGTTTTCO^^ 

AGGTACATGTCTGCTTTAAACCACACAAAGAAATGGAAATTTCCTCAAGTTGGTGGTTTAACTTCi^

ATGGGCTGATAACAATTGTTATTTGTCTAGTGTTTTATTAGCACTTCAACAGCTTGAAGTCAAATTCAATO 

CACCAGCACTTCAAGAGGCTTATTATAGAGCCCGTGCTGGTGATGCTGCTAACTTTTGTGCACTCATACTC 

GCTTACAGTAATAAAACTGTTGGCGAGCTTGGTGATGTCAGAGAAACTATGACCCATCTTCTACAGCATGC 

TAATTTGGAATCTGCAAAGCGAGTTCTTAATGTGGTGTGTAAACATTGTGGTCAGAA^ 

CGGGTCTAGAAGCTGTGATGTATATGGGTACTCTATCTTATGATAATCTTAAGACAGGTGTTTCCATT^^ 

tgtgtgtgtggtcgtgAtgctacacaatatctagtacaacaagagtcttcttttgttatg^ 

ACCTGCTGAGTATAAATTACAGCAAGGTACATTCTTATGTGCGAATGAGTACACTGGTAACTATCAGTGTG 

gtcattacactcatataactgctaaggagaccctctatcgtattgacggagctcaccttacaaagatgtca 

GAGTACAAAGGACCAGTGACTGATGTTTTCTACT^GGAAACATCTTACACTACAACCATCAAGCCTGTGTC 

gtataaactcgatggagttacttacacagagattgaaccaaaattggatgggtattataaaaaggataatg 

CTTACTATACAGAGCAGCCTATAGACCTTGTACCAACTCAACCATTACCAAATGCGAGTTTTGATAATTTC 

aaactcacatgttctaacacaaaatttgctgatgatttaaatcaaatgacaggcttc 

ACGAGAGCTATCTGTCACATTCTTCCCAGACTTGAATGGCGATGTAGTGGCTATTGACTATAGACACTATfl^ 

cagcgagtttcaagaaaggtgctaaattactgcataagccaattgtttggcacattaaccagg 

AAGACAACGTTCAAACCAAACACTTGGTGTTTACGTTGTCTTTGGAGTACAAAGCCAGTAGATACTTCAAA 

ttcatttgaagttctggcagtagaagacacacaaggaatggacaatcttgcttgtgaaagtcaacaaccca 

CCTCTGAAGAAGTAGTGGAAAATCCTACCATACAGAAGGAAGTCATAGAGTGTGACGTGAAAACTACCGAA 
GTTGTAGGCAATGTCATACTTAAACCATCAGATGAAGGTGTTAAAGTAACACAAGAGTTAGGTCATGAG(^ 
TCTTATGGCTGCTTATGTGGTlAAACACaAGCATTACCATTAAGAAACCTAATGAGCTTTCACTA^ 

gtttaaaaacaattgccactcatggtattgctgcaattaatagto 

gtcaaaccattcttaggacaagcagcaattacaacy^tcaaattgcgctaagagattagcacaacgtgtgto 
taacaattatatgccttatgtgtttacattattgttccaattgtgtacttttactaaaagtaccaattcta 

GAATTAGAGCTTCACTACCTACAACTATTGCTAAAAATAGTGTTAAGAGTGTTGCTAAATTATGTTTGGAT 

GCCGGCATTAATTATGTGAAGTCACCCAAATTTTCTAAATTGTTCACAATCGCTATGTGGCTATTGTTGTT 

AAGTATTTGCTTAGGTTCTCTAATCTGTGTAACTGCTGCTTTTGGTGTACTC'ITATCTAATTTT^ 

CTTCTTATTGTAATGGCGTTAGAGAATTGTATCTTAATTCGTCTAACGTTACTACTATGGATTTCTGTGAA

GGTTCTTTTCCTTGCAGCATTTGTTTAAGTGGATTAGACTCCCTTGATTCOT 

TCAGGTGACGATTTCATCGTACAAGCTAGACTTGACAATTTTAGGTCTGGCCGCTGAGTGGGTTTTGGCAT 
ATATGTTGTTCACAAAATTCTTTTATTTATTAGGTCTTTCAGCTATAATGCAGGTGTTCTTTGGCTATTCT 
GCTAGTCATTTCATCAGCAATTCTTGGCTCATGTGGTTTATCATTAGTATTGTACAAATGGCACCCGTTTC 
TGCAATGGTTAGGATGTACATCTTCTTTGCTTCTTTCTACTACATATGGAAGAGCTATGTTCATATCATGG 
ATGGTTGCACCTCTTCGACTTGCATGATGTGCTATAAGCGCAATCGTGCCACACGCGTTGAGTGTACAACT 
ATTGTTAATGGCATGAAGAGATCTTTCTATGTCTATGCAAATGGAGGCCGTGGCTTCTGCAAGACTCACAA 

TTGGAATTGTCTCAATTGTGACACATTTTGCACTGGTAGTACATTCATTAGTGATG^ 

TGTCACTCCAGTTTAAAAGACCAATCAACCCTACTGACCAGTCATCGTATATTGTTGATAGTGTTGCTC

AAAAATGGCGCGCTTCACCTCTACTTTGACAAGGCTGGTCAAAAGACCTATGAGAGACATCCGCTCTCCCA 

TTTTGTCAATTTAGACAATTTGAGAGCTAACAACACTAAAGGTTCACTGCCTATTAATGTCATAGTTTTTG 

ATGGCAAGTCCAAATGCGACGAGTCTGCTTCTAAGTCTGCTTCTGTGTACTACAGTCAGCTGATGTGCCAA 

CCTATTCTGTTGCTTGACCAAGCTCTTGTATCAGACGTTGGAGATAGTACTGAAGTTTCCGTTAAGATGTT 

TGATGCTTATGTCGACACCTTTTCAGCAACTTTTAGTGTTCCTATGGAAAAA 
CAGCTCACAGCGAGTTAGCAAAGGGTGTAGCTTTAGATGGTGTCCTTTCTACATTCGTGTCAGCTC^
CAAGGTGTTGTTGATACCGATGTTGACACAAAGGATGTTATTGAATGTCTCAAACTT^^

CTTAGAAGTGACAGGTGACAGTTGTAACAATTTCATGCTCACCTATAATAAGGTTGAAAACATGACGCCCA 

GAGATCTTGGCGCATGTATTGACTGTAATGCAAGGCATATCAATGCCCAAGTAGCAAAAAGTCACAATGTT 

TCACTCATCTGGAATGTAAAAGACTACATGTCTTTATCTGAACAGCTGCGTAAACAAATTCGTAGTGCTGC 

CAAGAAGAACAACATACCTTTTAGACTAACTTGTGCTACAACTAGACAGGTTGTCAATGTCATAA 

AAATCTCACTCAAGGGTGGTAAGATTGTTAGTACTTGTTTTAAACTTATGCTTAAGGCCAC^ 

GTTCTTCCTGCATTGGTTTGOrrATATCGTTATGCCAGTACATACATTGTCAATCC^ 

TGAAATCATTGGTTACAAAGCCATTCAGGATGGTGTCACTCGTGACAO^TTTCTAQpGAT^

CAAATAAACATGCTGGTTTTGACGCATGGTTTAGCCAGCGTGGTGGTTCATACAAAAATGACAAA^ 

CCTGTAGTAGCTGCTATCATTACAAGAGAGATTGGTTTCATAGTGCCTGGCTTACCGGGTACTGTGCTGAG 

AGCAATCAATGGTGACTTCTTGCATTTTCTACCTCGTGTTTTTAGTGCTGTTGGCAACATTTGCTAC^ 

CTTCCAAACTCATTGAGTATAGTGATTTTGCTACCTCTGCTTGCGTTCTTGCTGCTGAGTGTAC^ 

AAGGATGCTATGGGCAAACCTGTGCCATATTGTTATGACACTAATTTGCTAGAGGGTTCTATTTCTTATAG 

TGAGCTTCGTCCT^GACACTCGTTATGTGCTTATGGATGGTTCCATCATACAGTTTC 

AGGGTTCTGTTAGAGTAGTAACAACTTTTGATGCTGAGTACTGTAGACATGGTACATGCGAAAGGTCAGAA 

GTAGGTATTTGCCTATCTACCAGTGGTAGATGGGTTCTTAATAATGAGCATTACAGAGCTCTATCAGGAGT 

TTTCTGTGGTGTTGATGCGATGAATCTCATAGCTAACATCTTTACTCCTCTTGTOCAACCTGTGGGTGCOT 

TAGATGTGTCTGCTTCAGTAGTGGCTGGTGGTATTATTGCCATATTGGTGACTTGTGCTGCCTACTACTTT 

ATGAAATTCAGACGTGTTTTTGGTGAGTACAACCATGTTGTTGCTGCTAATGCACTTTTGTT^ 

TTTCACTATACTCTGTCTGGTACaVGCTTACAGCTTTCTGCCGGGAGTCTACTC^ 

TGACATTCTATTTCACCAATGATGTrTCATTCTTGGCTCACCTTCAATGGTTTGCCATGTTT^ 

GTGCCTTTTTGGATAACAGCAATCTATGTATTCTGTATTTCTCTGAAGCACTGCCATTGGTTCTTTAAC^ 

CTATCTTAGGAAAAGAGTCATGTTTAATGGAGTTACATTTAGTACCTTCGAGGAGGCTGCTTTGTGTACCT 

TTTTGCTCAACAAGGAAATGTACCTAAAATTGCGTAGCGAGACACTGTTGCCACTTACACAGTATAACAGG 

TATCTTGCTCTATATAACAAGTACAAGTATTTCAGTGGAGCCTTAGATACTACCAGCTATCGTGAAGCAGC 

TTGCTGCCACTTAGCAAAGGCTCTAAATGACTTTAGCAACTCAGGTGCTGATGTTCTCTACCAACCACCAC 

AGACATCAATCACTTCTGCTGTTCTGCAGAGTGGTTTTAGGAAAATGGCATTCCCGTCy^GGCAAAGTT^^ 

GGGTGCy^TGGTACAAGTAACCTGTGGAACTACAACTCTTAATGGATTGTGGTTGGATGACACAGTATACTC

TCCAAGACATGTCATTTGCACAGCAGAAGACATGCTTAATCCTAACTATGAAGATCTGCTCAMCGCAAA 

CCAACCATAGCTTTCTTGTTCAGGCTGGCAATGTTCAACTTCGTGTTATTGGCCATTCTATGCAAAATTGT 

CTCCTTAGGCTTAAAGTTGATACTTCTAACCCTAAGACACCCAAGTATAAATTTGTCCGTATCCAACCTC^ 

TCAAACATTTTCAGTTCTAGCATGCTACAATGGTTCACCATCTGGTGTTTATCAGTGTGCCATGAGACCTA 

ATCATACCATTAAAGGTTCTTTCCTTAATGGATCATGTGGTAGTGTTGGTTTTAACATTGATTATGATTGC 

GTGTCTTTCTGCTATATGCATCATATGGAGCTTCCAACAGGAGTACACGCTGGTACTGACTTAGAAGGTAA 

ATTCTATGGTCCATTTGTTGACAGACAAACTGCACAGGCTGCAGGTACAGACACAACC^ 

TTTTGGCATGGCTGTATGCTGCTGTTATCTU^TGGTGATAGGTGGTTTCTTAATAGAT^ 

AATGACTTTAACCTTGTGGCAATGAAGTACAACTATGAACCTTTGACACAAGATCATGTTGACATATTGGG 

ACCTCTTTCTGCTCAAACAGGAATTGCCGTCTTAGATATGTGTGCTGCTTTGAAAGAGCTGCTGCAGAATG 

GTATGAATGGTCGTACTATCCTTGGTAGCACTATTTTAGAAGATGAGTTTACACCATTTGATGTTGTTAGA 

CAATGCTCTGGTGTTACCTTCCAAGGTAAGTTCAAGAAAATTGTTAAGGGCACTCATCATTGGATGCTT^ 

AACTTTCTTGACATCACTATTGATTCTTGTTCAAAGTACACAGTGGTCACTGTTTTTCTTTGTTTACGAGA 

ATGCTTTCTTGCCATTTACTCTTGGTATTATGGCAATTGCTGCATGTGCTATGCTGCTTGTTAAGCATJ^ 

CACGCATTCTTGTGCTTGTTTCTGTTACCTTGTCTTGCAACAGTTGCTTACTTTAATATGGTCTACAT^ 

TGCTAGCTGGGTGATGCGTATCATGACATGGCTTGAATTGGCTGACACTAGCTTGTCTGGTTATAGGCTTA 

AGGATTGTGTTATGTATGCTTCAGCTTTAGTTTTGCTTATTCTCATGACAGCTCGCACTGTTTATGATGAT 

GCTGCTAGACGTGTTTGGACACTGATGAATGTCATTACACTTGTTTACAAAGTCTACTATGGTAATGCTTT 

AGATCAAGCTATTTCCATGTGGGCCTTAGTTATTTCTGTAACCTCTAACTATTCTGGTGTCGTTACGACTA 

TCATGTTTTTAGCTAGAGCTATAGTGTTTGTGTGTGTTGAGTATTACCCATTGTTATTTATTACTGGCAAC 

ACCTTACAGTGTATCATGCTTGTTTATTGTTTCTTAGGCTATTGTTGCTGCTGCTACTTTGGCCO^^ 

TTTACTCAACCGTTACTTCAGGCTTACTCTTGGTGTTTATGACTACTTGGTCTCTACACAAGAATTTAGGT 

ATATGAACTCCCAGGGGCTTTTGCCTCCTAAGAGTAGTATTGATGCTTTCAAGCTTAACATTAAGTTGTTG 

GGTATTGGAGGTAAACCATGTATCAAGGXTGCTACTGTACAGTCTAAAATGTCTGACGTAAAGTGCACATC 

TGTGGTACTGCTCTCGGTTCTTCAACAACTTAGAGTAGAGTCATCTTCTAAATTGTGGGCACAATGTGTAC 

AACTCCACAATGATATTCTTCTTGCAAAAGACACAACTGAAGCTTTCGAGAAGATGGTTTCTCTTTTGTCT 

GTTTTGCTATCCATGCAGGGTGCTGTAGACATTAATAGGTTGTGCGAGGAAATGCTCGATAACCGTGCT^ 

TCTTCAGGCTATTGCTTCAGAATTTAGTTCTTTACCATCATATGCCGCTTATGCCACTGCCCAO 

ATGAGCAGGCTGTAGCTAATGGTGATTCTCAAGTCGTTCTCAAAAAGTTAAAGAAATCTT^
AAATCTGAGTTTGACCGTGATGCTGCCATGCAACGCAAGTTGGAAAAGATGGCAGATCAG^

AATGTACAAACAGGCAAGATCTGAGGACAAGAGGGCAAAAGTAACTAGTGCTATGCAAACAATGCTCTTC^ 

CTATGCTTAGGAAGCTTGATAATGATGCACTTAACAACATTATCAACAATGCGCGTGATGGTTGTGTTCCA 

CTCAACATCATACCATTGACTACAGCAGCCAAACTCATGGTTGTTGTCCCTGATTATGGTACCTACAAGAA 

CACTTGTGATGGTAACACCTTTACATATGCATCTGCACTCTGGGAAATCCAGCAAGTTGTTGATGCGGATA 

GCAAGATTGTTCAACTTAGTGAAATTAACATGGACAATTCACCAAATTTGGCTTGGCCTCTTATTG^^ 

GCTCTAAGAGCCAACTCAGCTGTTAAACTACAGAATAATGAACTGAGTCCAGTAGCACTACGACAGATGTC 

CTGTGCGGCTGGTACCACACAAACAGCTTGTACTGATGACAATGCACTTGCCTACTApAACAAT^

GAGGTAGGTTTGTGCTGGCATTACTATCAGACCACCAAGATCTCAAATGGGCTAGATTTC 

GGTACAGGTACAATTTACACAGAACTGGAACCACCTTGTAGGTTTGTTACAGACACACCAAAAGGGCCTAA 

AGTGAAATACTTGTACTTCATCAAAGGCTTAAACAACCTAAATAGAGGTATGGTGCTGGGCAGTTTAGCrc 

CTACAGTACGTCTTCAGGCTGGAAATGCTACAGAAGTACCTGCCAATTCAACTGTGCTTTCCTTCTGTGCT 

TTTGCAGTAGACCCTGCTAAAGCATATAAGGATTACCTAGCAAGTGGAGGACAACCAATCACCAACTGTC 

GAAGATGTTGTGTACACACACTGGTACAGGACAGGCAATTACTGTAACACCAGAAGCTAACATGGACCA^ 

AGTCCTTTGGTGGTGCTTCATGTTGTCTGTATTGTAGATGCa^CATTGACC^ 

TGTGACTTGAAAGGTAAGTACGTCCAAATACCTACCACTTGTGCTAATGACCCAGTGGGTTTTACACOT^M 

AAACACAGTCrGTACCGTCTGCGGAATGTGGAAAGGTTATGGCTGTAGTTGTGACCAACTCCGCGAACCCT 

TGATGCAGTCTGCGGATGCATCAACGTTTTTAAACGGGTTTGCGGTGTAAGTGCAGCCCGTCTTACACCGT 

GCGGCACAGGCACTAGTACTGATGTCGTCTACAGGGCTTTTGATATTTACAACGAAAAAGTTGCTGGTTTT 

GCAAAGTTCCTAAAAACTAATO6CTGTCGCTTCCAGGAGAAGGATGAGGAAGGCAATTTATTAGACTCTTA 

CTTTGTAGTTAAGAGGCATACTAaXSTCTAACTACCAACATGAAGAGACTATTTATAACTTGGOT 

GTCCAGCGGTTGCTGTCCATGACTTTa?TCAAGTTTAGAGTAGATGGTGACATGGTACCACATATATCAC^ 

CAGCGTCTAACTAAATACACAATGGCTGATTTAGTCTATGCTCTACGTCATTTTGATGAGGGTAATTGTC^ 

TACATTAAAAGAAATACTCGTCACATACAATTGCTGTGATGATGATTATTTCAATAAGAAGGATTGGTATG 

ACTTCGTAGAGAATCCTGACATCTTACGCGTATATGCTAACTTAGGTGAGCGTGTACGCCAATCATTATTA 

AAGACTGTACAATTCTGCGATGCTATGCGTGATGCAGGCATTGTAGGCGTACTGACATTAGATAATCAGGA 

TCTTAATGGGAACTGGTACGATTTCGGTGATTTCGTACAAGTAGCACCAGGCTGCGGAGTTCCTATTGTGG 

ATTCATATTACTCATTGCTGATGCCCATCCTCACTTTGACTAGGGCATTGGCTGCTGAGTCCCATATGGAT

GCTGATCTCGCAAAACCACTTATTAAGTGGGATTTGCTGAAATATGATT^ 

CTTCGACCGTTATTTTAAATATTGGGACCAGACATACCATCCCAATTGTATTAACTGTTTGGATGATAGGT 

GTATCCTTCATTGTGCAAACTTTAATGTGTTATTTTCTACTGTGTTTCCACCTACAAGTTTTGGACCACTA

GTAAGAAAAATATTTGTAGATGGTGTTCCTTTTGTTGTTTCAACTGGATACCATTTTCGTGAGTTAGGAGT 

CGTACATAATCAGGATGTAAACTTACATAGCTCGCGTCTCAGTTTCAAGGAACTTTTAGTGTATGCTGCTG 

ATCCAGCTATGCATGCAGCTTCTGGCAATTTATTGCTAGATAAACGCACTACATGCTTTTCAGTAGCTGCA 

CTAACAAACAATGTTGCTTTTCAAACTGTCAAACCCGGTAATTTTAATAAAGACTTTTATGACTTTGCTGT 

GTCTAAAGGTTTCTTTAAGGAAGGAAGTTCTGTTGAACTAAAACACTTCTTCTTTK 

CTGCTATCAGTGATTATGACTATTATCGTTATAATCTGCCAACAATGTGTGATATCAGACAACTCCTA^^ 

GTAGTTGAAGTTGTTGATAAATACTTTGATTGTTACGATGGTGGCTGTATTAATGCCAACCAAGTAATCGT 

TAACAATCTGGATAAATCAGCTGGTTTCCCATTTAATAAATGGGGTAAGGCTAGACTTTATTATGACTCAA 

TGAGTTATGAGGATCAAGATGCACTTTTCGCGTATACTAAGCGTAATGTCATCCCTACTATAACTCAAATG 

AATCTTAAGTATGCCATTAGTGCAAAGAATAGAGCTCGCACCGTAGCTGGTGTCTCTATCTGTAGTACTAT 

GACAAATAGACAGTTTCATCAGAAATTATTGAAGTCAATAGCCGCCACTAGAGGAGCTACTGTGGTAATTG

GAACAAGCAAGTTTTACGGTGGCTGGCATAATATGTTAAAAACTGTTTACAGTGATGTAGAAAC^ 

CTTATGGGTTGGGATTATCCAAAATGTGACAGAGCCATGCCTAACATGCTTAGGATAATGGCCTC 

TCTTGCTCGCAAACATAACACTTGCTGTAACTTATCACACCGTTTCTACAGGTTAGCTAACGAGTGT^ 

AAGTATTAAGTGAGATGGTCATGTGTGGCGGCTCACTATATGTTAAACCAGGTGGAACATCATCCGGTGAT 

GCTACAACTGCTTATGCTAATAGTGTCTTTAACATTTGTCAAGCTGTTACAGCCAATGTAAATGCACTTCT 

TTCAACTGATGGTAATAAGATAGCTGACAAGTATGTCCGCAATCTACAACACAGGCTCTATGAGTGTCTCT 

ATAGAAATAGGGATGTTGATCATGAATTCGTGGATGAGTTTTACGCTTACCOKKrGTAAACATO 

ATGATTCTTTCTGATGATGCCGTTGTGTGCTATAACAGTAACTATGCGGCTCT^GGTTTAGTAGCT^ 

TAAGAACTTTAAGGCAGTTCTTTATTATCAAAATAATGTGTTCATGTCTGAGGCAAAA 

CTGACCTTACTAAAGGACCTCACGAATTTTGCTCACAGCATACAATGCTAGTTAAACAAGGAGATGATTA^ 

GTGTACCTGCCTTACCCAGATCCATCAAGAATATTAGGCGCAGGCTGTTTTGTCGATGATATTGTCAAAAC 

AGATGGTACACTTATGATTGAAAGGTTCGTGTCACTGGCTATTGATGCTTACCCACTTACAAAACATCCTA 

ATCAGGAGTATGCTGATGTCTTTCACTTGTATTTACAATACATTAGAAAGTTACATGATGAGCTTACTGGC 

CACATGTTGGACATGTATTCCGTAATGCTAACTAATGATAACACCTCACGGTACTGGGAACCTGAGTTTTA 

TGAGGCTATGTACACACCACATACAGTCTTGCAGGCTGTAGGTGCTTGTGTATTGTGCAATTCAC^
CACTTCGTTGCGGTGCCTGTATTAGGAGACCATTCCTATGTTGCAAGTGCTGCTATGACCATC
ACATCACACAAATTAGTGTTGTCTGTTAATCCCTATGTTTGCAATGCCCCAGGT 

GACACAACTGTATCTAGGAGGTATGAGCTATTATTGCAAGTCACATAAGCCTCCCATTAGTTTTCCATTAT 

GTGCTAATGGTCAGGTTTTTGGTTTATACAAAAACACATGTGTAGGCAGTGACAATGTCACTGACTTCAAT 

GCGATAGCAACATGTGATTGGACTAATGCTGGCGATTACATACTTGCCAACACTTGTACTGAGAGACTCAA 

GCTTTTCGCAGCAGAAACGCTCAAAGCCACTGAGGAAACATTTAAGCTGTCATATGGTATTGCCACTGTAC 

GCGAAGTACTCTCTGACAGAGAATTGCATCTTTCATGGGAGGTTGGAAAACCTAGACCACCATTC 

AACTATGTCTTTACTGGTTACCGTGTAACTAAAAATAGTAAAGTACAGATTGGAGAGTACACCTTTGAAAA

AGGTGACTATGGTGATGCTGTTGTGTACAGAGGTACTACGACATACAAGTTGAATGTTGGTGATTACTTTO

TGTTGACATCTCACACTGTAATGCCACTTAGTGCACCTACTCTAGTGCCACAAGAGCACTATGTGAGAATT

ACTGGCTTGTACCCAACACTCAACATCTCAGATGAGTTTTCTAGCAATGTTGCAAATTATCAAAAGGTCGG 

CATGCAAAAGTACTCTACACTCCAAGGACCACCTGGTACTGGTAAGAGTCATTTTGCCATCGGACTTGCTC

TCTATTACCCATCTGCTCGCATAGTGTATACGGCATGCTCTCATGCAGCTGTTGATGCCCTATGTGAAAAG 

GCATTAAAATATTTGCCCATAGATAAATGTAGTAGAATCATACCTGCGCGT(^GCGCGTAGAGTGTTTT^ 

TAAATTCAAAGTGAATTCAACACTAGAACAGTATGTTTTCTGCACTGTAAATGCATTGCCAGAAACAACTG 

CTGACATTGTAGTCTTTGATGAAATCTCTATGGCTACTAATTATGACTTGAGTGW 

CGTGCAAAACACTACGTCTATATTGGCGATCCTGCTCAATTACCAGCCCCCCGCACATTGCTGACTAAAGG 

CACACTAGAACCAGAATATTTTAATTCAGTGTGCAGACTTATGAAAACAATAGGTCCAGACATGTTCCTTG 

GAACTTGTCGCCGTTGTCCTGCTGAAATTGTTGACACTGTGAGTGCTTTAGTTTATGACAATAAGCTAAAA 

GCACACAAGGATAAGTCAGCTCAATGCTTCAAAATGTTCTACAAAGGTGTT^TTACACATGATGTTTCATC 

TGCAATCAACAGACCTCAAATAGGCGTTGTAAGAGAATTTCTTACACGCAATCCTGCTTGGAGAAAAGCTG 

TTTTTATCTCACCTTATAATTCACAGAACGCTGTAGCTTCAAAAATCTTAGGATTGCCTACGCAGACTGTT 

GATTCATCACAGGGTTCTGAATATGACTATGTCATATTCACACTU^CTACTGAAACAGCACACTCT 

TGTCAACCGCTTCAATGTGGCTATCACAAGGGCAAAAATTGGCATTTTGTGCATAATGTCTGAT^ 

TTTATGACAAACTGCAATTTACT^GTCTAGAAATACCACGTCGCAATGTGGCTACATTACAAGCAGAAAAT 

GTAACTGGACTTTTTAAGGACTGTAGTAAGATCATTACTGGTCTTCATCCTACACAGGCACCTACACACCT 

CAGCGTTGATATAAAGTTCAAGACTGAAGGATTATGTGTTGACATACCAGGCATACCAAAGGACATGACCT 

ACCGTAGACTCATCTCTATGATGGGTTTCAAAATGAATTACCAAGTCAATGGTTACCCTAATATGTTTATC 

ACCCGCGAAGAAGCTATTCGTCACGTTCGTGCGTGGATTGGCTTTGATGTAGAGGGCTGTCATGCAACTAG 

AGATGCTGTGGGTACTAACCTACCTCTCCAGCTAGGATTTTCTACAGGTGTTAACTTAGTAGCTGTACCGA 

CTGGTTATGTTGACACTGAAAATAACACAGAATTCACCAGAGTTAATGCAAAACCTCCACCAGGTGACCAG 

TTTAAACATCTTATACCACTCATGTATAAAGGCTTGCCCTGGAATGTAGTGCGTATTAAGATAGTACAAAT 

GCTCAGTGATACACTGAAAGGATTGTCAGACAGAGTCGTGTTCGTCCTTTGGGCGCATGGCTTTGAGCTTA 

CATCAATGAAGTACTTTGTCAAGATTGGACCTGAAAGAACGTGTTGTCTGTGTGACAAACGTGCAACTTGC 

TTTTCTACTTCATCAGATACTTATGCCTGCTGGAATCATTCTGTGGGTTTTGACTATGTCTATAACCCATT 

TATGATTGATGTTCAGCAGTdGGGCTTTACGGGTAACCTTCAGAGTAACCATGACCAACATTGCCAGGTAC 

ATGGAAATCCACATGTGGCTAGTTGTGATGCTATCATGACTAGATGTTTAGCAGTCCATGAGTGC 

AAGCGCGTTGATTGGTCTGTTGAATACCCTATTATAGGAGATGAACTGAGGGTTAATTCTGCT 

AGTACAACACATGGTTGTGAAGTCTGCATTGCTTGCTGATAAGTTTCCAGTTCTTCATGACATTGGAAATC 

CAAAGGCTATCAAGTGTGTGCCTCAGGCTGAAGTAGAATGGAAGTTCTACGATGCTCAGCCATGTAGTGAC 

AAAGCTTACAAAATAGAGGAACTCTTCTATTCTTATGCTACACATCACGATAAATTCACTGATGGTGTTTG 

TTTGTTTTGGAATTGTAACGTTGATCGTTACCCAGCCAATGCAATTGTGTGTAGGTTTGACACAAGAGTCT 

TGTCAAACTTGAACTTACCAGGCTGTGATGGTGGTAGTTTGTATGTGAATAAGCATGCATTCCACACTCCA 

GCTTTCGATAAAAGTGCATTTACTAATTTAAAGCAATTGCCTTTCTTTTACTATTC

GTCTCATGGCAAACAAGTAGTGTCGGATATTGATTATGTTCCACTCAAATCTGCTACGTC 

GCAATTTAGGTGGTGCTGTTTGCAGACACCATGCAAATGAGTACCGACAGTACTTGGATGCATATAATATG 

ATGATTTCTGCTGGATTTAGCCTATGGATTTACAAACAATTTGATACTTATAACCTGTGGAATACATTTAC 

CAGGTTACAGAGTTTAGAAAATGTGGCTTATAATGTTGTTAATAAAGGACACTTTGATGGACACGCCGGCG 

AAGCACCTGTTTCCATCATTAATAATGCTGTTTACACAAAGGTAGATGGTATTGATGTGGAGATCTTTGAA 

AATAAGACAACACTTCCTGTTAATGTTGCATTTGAGCTTTGGGCTAAGCGTAACATTAAACCAGTG^ 

GATTAAGATACTCAATAATTTGGGTGTTGATATCGCTGCTAATACTGTAATCTGGGACTACy^AAAG^^ 

CCCCAGCACATGTATCTACAATAGGTGTCTGCACAATGACTGACATTGCCAAGAAACCTACTGAGAGTGCT 

TGTTCTTCACTTACTGTCTTGTTTGATGGTAGAGTGGAAGGACAGGTAGACCTTTTTAGAAACGCCCGTAA 

TGGTGTTTTAATAACAGAAGGTTCAGTCAAAGGTCTAACACCTTCAAAGGGACCAGCACAAGCTAGCGTCA 

ATGGAGTCACATTAATTGGAGAATCAGTAAT^CACAGTTTAACTACTTTAAGAAAGTAGACGGCATTATT 

C/^CAGTTGCCTGAAACCTACTTTACTCAGAGCAGAGACTTAGAGGATTTTAAGCCCAGATCACAAATGGA 

AACTGACTTTCTCGAGCTCGCTATGGATGAATTCATACAGCGATATAAGCTCGAGGGCTATGCCTTCGJ^ 



FIGURE 3M 




ACATCGTTTATGGAGATTTCAGTCATGGACAACTTGGCGGTCTTCATTTTVATGATAGGCTTAGCCAAGC^ 

TCAC?^GATTCACCACTTAAATTAGAGGATTTTATCCCTATGGACAGCACAGTGAAAAATTACTTCATAAC 

AGATGCGCAAACAGGTTCATCAAAATGTGTGTGTTCTGTGATTGATCTTTTACTTGATGACTTTGTCGAGA 

TAATAAAGTCACAAGATTTGTCAGTGATTTCAAAAGTGGTCAAGGTTACAATTGACTATGCTGAAATT^^ 

TTCATGCTTTGGTGTAAGGATGGACATGTTGAAACCTTCTACCCAAAACTACAAGCAAGTCAAGCGTGGCA 

ACCAGGTGTTGCGATGCCTAACTTGTACAAGATGCAAAGAATGCTTCTTGAAAAGTdTGACCTTCAGAA 

ATGGTGAAAATGCTGTTATACCAAT^GGAATAATGATGAATGTCGCAAAGTATACTCAACTGTG^^ 

TTAAATACACTTACTTTAGCTGTACCCTACAACATGAGAGTTATTCACTTTGGTGCTGGCTCTGATAAAGC3 

AGTTGCACCAGGTACAGCTGTGCTCAGACAATGGTTGCCAACTGGCACACTACTTGTCGATTCAGATCT^ 

ATGACTTCGTCTCCGACGCAGATTCTACTTTAATTGGAGACTGTGCAACAGTACATACGGCTAATAAATGG 

GACCTTATTATTAGCGATATGTATGACCCTAGGACCAAACATGTGACAAAAGAGAATGACTCTAAAGAAGG 

GTTTTTCACTTATCTGTGTGGATTTATAAAGCAAAAACTAGCCCTGGGTGGTTCTATAGCTGTAAAGATAA 

CAGAGCATTCTTGGAATGCTGACCTTTACAAGCTTATGGGCCATTTCTCATGGTGGACAGCTTa^ 

AATGTAAATGCATCATCATCGGAAGCATTTTTAATTGGGGCTAACTATCTTGGCAAGC^ 

TGATGGCTATACCATGCATGCTAACTACATTTTCI^AGGAACACAAATCCTATCCAGTTGTCTTC 

CACTCTTTGACATGAGCAAATTTCCTCTTAAATTAAGAGGAACTGCTGTAATGTCTCTTAAGGAGAATC/^ 

ATCAATGATATGATTTATTCTCTTCTGGAAAAAGGTAGGCTTATCATTAGAGAAAACAACAGAGTTGTGGT 

TTCAAGTGATATTCTTGTTAACAACTAAACGAACATGTTTATTTTCTTATTATTTCTTACTCTCACTAGTG 

GTAGTGACCTTGACCGGTGCACCACTTTTGATGATGTTCAAGCTCCTAATTACACTCAACATAC^^ 

ATGAGGGGGGTTTACTATCCTGATGAAATTTTTAGATCAGACACTCTTTATTTAACTCA 

TCCATTTTATTCTAATGTTACAGGGTTTCATACTATTAATCATACGTTTGGCAACCr 

AGGATGGTATTTATTTTGCTGCCACAGAGAAATCAAATGTTGTCCGTOGTTGGGTTTTTTC 

AACAACAAGTCACAGTCGGTGATTATTATTAACAATTCTACTAATGTTGTTATACGAGCATGTAACTTTGA 

ATTGTGTGACAACCCTTTCTTTGCTGTTTCTAAACCCATGGGTACACAGACACATACTATGATATTCGATA 

ATGCATTTAATTGCACTTTCGAGTACATATCTGATGCCTTTTCGCTTGATGTTTCAGAAAAGTCAGGTAAT 

TTTAAACACTTACGAGAGTTTGTGTTTAAAAATAAAGATGGGTTTCTCTATGTTTATAAGGGCTATCAACC 

TATAGATGTAGTTCGTGATCTACCTTCTGGTTTTAACACTTTGA7ACCTATTTTTAAGTTGCCTCTTGGTA 

TTAACATTACAAATTOTAGAGCCATTCTTACAGCCTTTTCACCTGCTCAAGACATTTC 

GCAGCCTATTTTGTTGGCTATTTAAAGCCAACTACATTTATGCTCAAGTATGATGA7VAATGGTACAATCAC 

AGATGCTGTTGATTGTTCTCAAAATCCACTTGCTGT^CTCAJ^TGCTCTOTTAAGAGCTTO 

AAGGAATTTACCAGACCTCTAATTTCAGGGTTGTTCCCTCAGGAGATGTTGTGAGATTCCCTAATATTACA

AACTTGTGTCCTTTTGGAGAGGTTTTTAATGCTACTAAATTCCCTTCTGTCTATGCATGGGAGAGAAAAAA 

AATTTCTAATTGTGTTGCTGATTACTCTGTGCTCTACAACTCAACATTTTTTTCAACCTTTT^GTGCTATG 

GCGTTTCTGCCACTAAGTTGAATGATCTTOXjCTTCTCCAATGTCTATGCAGATTCTTTTGTAGTCAAGGGA 

GATGATGTAAGACAAATAGCGCCAGGACAAACTGGTGTTATTGCTGATTATTATTATAAATTGCCAGAO^ 

tttcatgggttgtgtccttgcttggaatactaggaacattgatgctacttcaactggtaattataatt 

aatataggtatcttagacatggcaagcttaggccctttgagagagacatatctaatgtgcctttctc 

gatggcaaaccttgcaccccacctgctcttaattgttattggccattaaatgattatggtttttacaccac 

TACTGGCATTGGCTACCAACCTTACAGAGTTGTAGTACTTTCTTTTGAACTTTTAAATGCACCGGCCACGG 

tttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcact 

GGTACTGGTGTGTTAACTCCTTCTTCAAAGAGATTTCAACCATTTCAACAATTTGGCCGTGATGTTTCTG^ 

tttcactgattccgttcgagatcctaaaacatctgaaatattagacatttcaccttgcgcttttgggggto 

TAAGTGTAATTACACCTGGAACAAATGCTTCATCTGAAGTTGCTGTTCTATATCAAGATGTTAACa^ 

gatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattctactggaaacaatgt 
attccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattccta 
ttggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtg 

GCTTATACTATGTCTTTAGGTGCTGATAGTTCAATTGCTTACTCTAATAACACCATTGCTATACCTACTAA 

cttttcaattagcattactacagaagtaatgcctgtttctatggctaaaacctccgtagattgtaatatgt 

ACATCTGCGGAGATTCTACTGAATGTGCTAATTTGCTTCTCCAATATGGTAGCTTTTGCACACAACTAAAT

CGTGCACTCTCAGGTATTGCTGCTGAA»GGATCGCAACACACGTGAAGTGTTCGCTCAAGTCAA^ 

GTAO^AAACCCCAACTTTGAAATATTTTGGTGGTTTTAATTTTTCACAAATATTACCTGACCCTCTAAA^ 

caactaagaggtcttttattgaggacttgctctttaataaggtgacactcgctgatgctggcttcatgaag 

CAATATGGCGAATGCCTAGGTGATATTAATGCTAGAGATCTCATTTGTGCGCAGAAGTTCAATGGACTTAC 

agtgttgccacctctgctcactgatgatatgattgctgcctacactgctgctctagttagtggtactgcca 

CTGCTGGATGGACATTTGGTGCTGGCGCTGCTCTTCAAATACCTTTTGCTATGCAAATGGCATATAGGTTC 

aatggcattggagttacccaaaatgttctctatgagaaccaaaaacaaatcgccaaccaatttaacm 

GATTAGTCAAATTCAAGAATCACTTACAACAACATCAACTGCATTGGGCAAGCTGCAAGACGO^
AGAATGCTCAAGCATTAT^CACACTTGTTAAACAACTTAGCTCTAATTTTGGTGCAAT

AATGATATCCTTTCGCGACTTGATAAAGTCGAGGCGGAGGTACAAATTGACAGGTTAATTACAG

TCAAAGCCTTCAAACCTATGTAACACAACAACTAATCAGGGCTGCTGAAATCAGGGCTTCTGCTAATCTTG 

CTGCTACTAAAATGTCTGAGTGTGTTCTTGGACAATCAAAAAGAGTTGACTTTTGTGGAAAGGGCTACCAC 

CTTATGTCCTTCCCACAAGCAGCCCCGCATGGTGTTGTCTTCCTACATGTCACGTATGTGCCATCCCAGGA 

GAGGAACTTCACCACAGCGCCAGCAATTTGTCATGAAGGCAAAGCATACTTCCCTCGTGAAGGTGTTTTTG 

TGTTTAATGGGACTTC^TGGTTTATTACACAGAGGAACTTCTTTTCTCCACAAATAATT^^ 

ACATTTGTCTCAGGAAATTGTGATGTCGTTATTGGCATCATTAACAACACAGT^

TGAGCTTGACTCATTCAAAGAAGAGCTGGACAAGTACTTCAAAAATCATACATCACCAGATC 

GCGACATTTCAGGCATTAACGCTTCTGTCGTCAACATTCAAAAAGAAATTGACCGCCTCAATGAGGTCGCT 

AAAAATTTAAATGAATCACTCATTGACCTTCAAGAATTGGGAAAATATGAGCAATATATTAAATGGCCTTG 

GTATGTTTGGCTCGGCTTCATTGCTGGACTAATTGCCATCGTCATGGTTACAATCTTGCTTTGTTGCATGA

CTAGTTGTTGCAGTTGCCTCAAGGGTGCATGCTCTTGTGGTTCTTGCTGCAAGTTTGATGAGGATGACTCT 

GAGCCAGTTCTCTAGGGTGTCAAATTACATTACACATAAACGAACTTATGGATTTGTTO^^ 

ACTCTTAGATCAATTACTGCACAGCaVGTAAAAATTGACAATGCTTCTCCTGCAAGTACTGTTCAT^ 

AGCAACGATACCGCTACAAGCCTCACTCCCTTTCGGATGGCTTGTTATTGGCGTTGCATTTCTTGCTC 

TTCAGAGCGCTACCAAAATAATTGCGCTCAATAAAAGATGGCAGCTAGCCCTTTATAAGGGCTTCCAGTTC 

ATTTGCAATTTACTGCTGCTATTTGTTACCATCTATTCACATCTTTTGCTTGTCGCTGCAGGTATGGAGGC 

GCAATTTTTGTACCTCTATGCCTTGATATATTTTCTACAATGCATCAACGCATGTAGAATTATTATGAGAT 

GTTGGCTTTGTTGGAAGTGCAAATCCAAGAACCCATTACTTTATGATGCCAACTACTTTGTTTGCTGGCAC 

ACACATAACTATGACTACTGTATACCATATAACAGTGTCACAGATACAATTGTCGTTACTGAAGG 

CATTTOU^CACCAAAACTCAAAGAAGACTACCAAATTGGTGGTTATTCTGA^ 

AAGACTATGTCGTTGTACATGGCTATTTCACCGAAGTTTACTACCAGCTTGAGTCTACACAAATTACTAC^ 

GACACTGGTATTGAT^TGCTACATTCTTCATCTTTAACAAGCTTGTTAAAGACCCACCGAATGTGCAAA 

ACACACAATCGACGGCTCTTCAGGAGTTGCTAATCCAGCAATGGATCCAATTTATGATGAGCCGACGACGA 

ctactagcgtgcctttgtaagcacaagaaagtgagtacgaacttatgtactcattcgtttcggaagaaaca 

GGTACGTTAATAGTTAATAGCGTACTTCTTTTTCTTGCTTTCGTGGTATTCTTGCTAGTCACACTAGCCAT 

ccttactgcgcttcgattgtgtgcgtactgctgcaatattgttaacgtgagtttagtaaaaccaacggttt 
acgtctactcgcgtgttaaaaatctgaactcttctgaaggagttcctgatcttctggtctaaacgaactaa 

CTATTATTATTATTCTGTTTGGAACTTTAACATTGCTTATCATGGCAGACAACGGTACTATTACCGTT^ 

gagcttaaacaactcctggaacaaoxsgaacctagtaataggtttcctattcctagcctggattatgttact 

acaatttgcctattctaatcggaacaggtttttgtacataataaagcttgttttcctctggctcttgtggc 

CAGTAACACTTGCTTGTTTTGTGCTTGCTGCTGTCTACAGAATTAATTGGGTGACTGGCGGGATTGCGATT 

GCAATGGCTTGTATTGITAGGCTTGATGTGGCTTAGCTACTTCGTTGCTTCCTTCAGGCTGTTTGCTCGTAC 

CCGCTCAATGTGGTCATTCAACCCAGAAACAAACATTCTTCTCAATGTGCCTCTCCGGGGGACAATTGTGA 

CCAGACCGCTCATGGAAAGTfeAACTTGTCATTGGTGCTGTGATCATTCGTGGTCACT^ 

CACTCCCTAGGGCGGTGTGACATTAAGGACCTGCCAAAAGAGATCACTGTGGCTACATCACGAACGCTTTC 

TTATTACAAATTAGGAGCGTCGCAGCGTGTAGGCACTGATTCAGGTTTTGCTGCATACAACCGCTACCGTA 

TTGGAAACTATAAATTAAATACAGACCACGCCGGTAGCT^CGACAATATTGCTTTGCTAGTACAGTAAGTG 

ACAACAGATGTTTCATCTTGTTGACTTCCAGGTTACAATAGCAGAGATATTGATTATCATTATGAGGACTT 

TCAGGATTGCTATTTGGAATCTTGACGTTATAATAAGTTCAATAGTGAGACAATTATTTAAGCCTCTAACT 

AAGAAGAATTATTCGGAGTTAGATGATGAAGAACCTATGGAGTTAGATTATCCATAAAACGAACATGAAAA 

TTATTCTCTTCCTGACATTGATTGTATTTACATCTTGCGAGCTATATCACTATCAGGAGTGTGTTAGAGGT 

ACXSACTGTACTACTAAAAGAACCTTGCCCATCAGGAACATACGAGGGCAATTCACCATTTCACCCTC^^ 

TGACAATAAATTTGCACTAACTTGCACTAGCACACACTTTGCTTTTGCTTGTGCTGACGGTACTCGA 

CCTATCAGCTGCGTGCAAGATCAGTTTCACCAAAACTTTTCATCAGACAAGAGGAGGTTCAACAAGAGCTC 

TACTCGCCACTTTTTCTCATTGTTGCTGCTCTAGTATTTTTAATACTTTGCTTCACCATTAAGAGAAAGAC 

AGAATGAATGAGCTCACTTTAATTGACTTCTATTTGTGCTTTTTAGCCTTTCTCCTATTCCTTGTTT 

AATGCTTATTATATTTTGGTTTTCACTCGAAATCCAGGATCTAGAAGAACCTTGTACCAAAGTCTAAACGA 

ACATGAAACTTCTCATTGTTTTGACTTGTATTTCTCTATGCAGTTGCATATCCACTGTAGT^ 

GCATCTAATAAACCTCATGTGCTTGAAGATCCTTCTAAGGTACTU^CACTAGGGGTAATACTTATAGC^ 

CTTGGCTTTGTGCTCTAGGAAAGGTTTTACCTTTTCATAGATGGCACACTATGGTTCAAAC^ 

AATGTTACTATCAACTGTCAAGATCCAGCTGGTGGTGCGCTTATAGCTAGGTGTTGGTACCTTCATGAAGG 

TCACCAAACTGCTGCATTTAGAGACGTACTTGTTGTTTTAAATAAACGAACAAATTAAAATGTCTGATAAT 

GGACCCCAATCAAACCAACGTAGTGCCCCCCGCATTACATTTGGTGGACCCACAGATTCAACTGACAATAA 

CCAGAATGGAGGACGCAATGGGGCAAGGCCAi^AACAGCGCCGACCCCAAGGTTTACCCAATAATACTGCGT 

CTTGGTTCACAGCTCTCACTCAGCATGGCAAGGAGGAACTTAGATTCCCTCGAGGCCAGGGCGTTCCAATC 
AACACXAATAGTGGTCCAGATGACCAAATTGGCTACTACCGAAGAGCTACCCGACGAGTTCGTGGTTO

CGGCAAAATGAAAGAGCTCAGCCCCAGATGGTACTTCTATTACCTAGGAACOXSGCCCAGAAGCT^ 

CCTACGGCGCTAACAAAGAAGGCATCGTATGGGTTGCAACTGAGGGAGCCTTGAATACACCCAAAGACC^ 

ATTGGCACCCGCAATCCTAATAACAATGCTGCCACCGTGCTACAACTTCCTCAAGGAACAACATTGCCAAA 

AGGCTTCTACGCAGAGGGAAGCAGAGGCGGCAGTCAAGCCTCTTCTCGCTCCTCATCACGTAGTCGCGGTA 

ATTCAAGAAATTCAACTCCTGGCAGCAGTAGGGGAAATTCTCCTGCTCGAATGGCTAGCGGAGGTGGTGAA 

ACTGCCCTCGCGCTATTGCTGCTAGACAGATTGAACCAGCTTGAGAGCAAAGTTTCTGGTAAAGGCCA^ 

ACAACAAGGCCAAACTGTCACTAAGAAATCTGCTGCTGAGGCATCTAAAAAGCCTra 

CCACT^AAACAGTACAACGTCACTCAAGCATrrGGGAGACGTGGTCCAiSAACAAACCC 

GACCAAGACCTAATCAGACAAGGAACTGATTACAAACATTCGCCGCAAATTGCACAATTTG^ 

CTCTGCATTCTTTGGAATGTCACGCATOXSGCATGGAAGTCACACCTTCGGGAACATGGCTGACTTATCATC 

GAGCCATTAAATTGGATGACAAAGATCCACAATTCAAAGACAACGTCATACTGCTGAACAAGCACATTGAC 

GCATACAAAACATTCCCACCAACAGAGCCTAAAAAGGACAAAAAGAAAAAGACTGATOAAGCTCAGCCTTT 

GCCGCAGAGACAAAAGAAGCAGCCCACTGTGACTCTTCTTCCTGCGGCTGACATGGATGATa^^ 

AACTTCAAAATTCCATGAGTGGAGCTTCTGCTGATTCAACTCAGGCATAAACACTCATGAT^ 

GGCAGATGGGCTATGTAAACGTTTTCGCAATTCCGTTTACGATACATAGTCTACTCTTGTGCAGAAO^ 

TCTCGTAACTAAACAGCACAAGTAGGTTTAGTTAACTTTAATCTCACATAGCAATCTTTAATCAATGTG 

ACATTAGGGAGGACTTGAAAGAGCCACCACATTTTCATCGAGGCCACGCGGAGTACGATCGAGGGTACAGT 

GAATAATGCTAGGGAGAGCTGCCTATATGGAAGAGCCCTAATGTGTAAAATTAATTTTAGTAGTGCTATCC 

CCATGTGATTTTAATAGCTTCTTAGGAGAATGACAAAAAAAAAAAAAAAAAAAAAAAA 

GenBank Accession No. AY274119.2; SEQ ID NO: 2

ERV-2

TOR2 ACACTCATGATGACCACACAAGGCAGATGGGCTATGTAAACGTTTTCGCAATTCCGTTTA

AIBV

ERV-2

TOR2 CGATACATAGTCTACTCTTGTGCAGAATGAATTCTCGTAACTAAACAGCACAAGTAGGTT

AIBV

ERV-2 ACCCGTTACCCTAAAATTCCCTCC

TOR2 TAGTTAACTTTAATCTCACATAGCAATCTTTAATCAATGTGTAACATTAGGGAGGACTTG

AIBV TAGTTTAGTTTAAGTTAGTTTAG

* * ** *

ERV-2 CCTTTCTCTTCAC TCGCCGAGGCCACGCCGAGTAGGACCGAGGGTACAGC

TOR2 AAAGAGCCACCACATTT-- TC ATCGAGGCCACGCGGAGTACGATCGAGGGTACAGT

AIBV AGTAGGTATAAAGATGCCAGTGCCGGGGCCACGCGGAGTACGATCGAGGGTACAGCACTA

* ** ******** ***** ** ***********

ERV-2 "GAGTCTTT-TAGTTTAAGGTGT-TAGATGTAAGGTACGTGGGCTTTCT--TTTGGTTO

TOR2 -GAATAATGCTAGGGAGAGCTGCCTATATGGAAGAGCCCTAATGTGTAAAATTAATTTTA

AIBV GGACGCCCATTAGGGGAAGA-GCTAAATTTTAGTTTAAGTTAAGTTTAA TTGGCTAA

** *** ****** * ** * **

ERV-2 CTTCTTC GenBank: AF361253 (SEQ ID NO: 31)

TOR2 GTAGTGCTATCCCCATGTGATTTTAATAGCTTCTTAGGAGAATGAC (SEQ ID NO: 18)

AIBV GTATAGTTAAAATTTATAGGCTAGTATAGAGTTAGAGCA GenBank: NC_001451 (SEQ ID NO: 32)

*
MFIFLLFLTLTSGSDLDRCTTFDDVQAPNYTQHTSSMRGVYYPDEIFRSD

TLYLTQDLFLPFYSNVTGFHTINHTFGNPVIPFKDGIYFAATEKSNWRG 

WVFGSTMNNKSQS VI I INNSTNWIRACNFELCDNPFFAVSKPMGTQTHT 

MIFDNAFNCTFEYISDAFSLDVSEKSGNFKHLREFVFKNKDGFLYVYKGY 

QPIDWRDLPSGFNTLKPIFKLPLGINITNFRAILTAFSPAQDIWGTSAA

AYFVGYLKPTTFMLKYDENGTITDAVDCSQNPLAELKCSVKSFEIDKGIY

QTSNFRWPSGDWRFPNiraLCPFGEVPNATKFPSVYAWERKKISNCVA

DYSVLYNSTFFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQIAPG

QTGVIADYNYKLPDDFMGCVLAWNTRNIDATSTGNYNYKYRYLRHGK^

FERDISNVPFSPDGKPCTPPALNCYWPLNDYGFYTTTGIGYQPYRVWLS

FELIiNAPATVCGPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQ 

FGRDVSDFTDSVRDPKTSEILDISPCAF6GVSVITPGTNASSEVAVLYQD 

VNCTDVSTAIHADQLTPAWRIYSTGNNVFQTQAGCLIGAEHVDTSYECDI 

PIGAGICASYHTVSLLRSTSQKSIVAYTMSLGADSSIAYSNNTIAIPTNF 

SISITTEVMPVSMAKTSVDCNMYICGDSTECANLLLQYGS^^^ 

GIAAEQDRNTREVFAQVKQMYKTPTLKYFGGFNFSQILPDPLKPTKRSFI 

EDLLFNKVTLADAGFI^QYGECLGDINARDLICAQKFNGLTVLPPIiLTDD 

M1AAYT?JUjVSGTATAGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYE 

NQKQIANQFNKAISQIQESLTTTSTALGKLQDWNQNAQALNTLVKQIjSS 

NFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEI 

RASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGWFLHVTYV 

PSQEKNFTTAPAICHEGKAYFPREGVFVFNGTSWFITQRNFFSPQIITTD 

NTFVSGNCDWIGIINNTVYDPLQPELDSFKEELDKYFKiaHTSPDVDLGD 

ISGINASV\miQKEIDRLNEVAK3Sn^NESLIDLQELGKYEQYIKWPWYVWL 

GFIAGLIAIVMVTILLCCMTSCCSCLKGACSCGSCCKFDEDDSEPVLKGV 

KLHYT (SEQ ID NO: 33) 
MADNGTITVEELKQLLEQWNLVIGFLFLAWIMLLQFAYSNRNRFLYIIKL
VFLWLLWPVTLACFVLAAVYRINWVTGGIAIAMACIVGLMWLSYFVASFR
LFARTRSMWSFNPETNILLNVPLRGTIVTRPLMESELVIGAVIIRGHLRM
AGHSLGRCDIKDLPKEITVATSRTLSYYKLGASQRVGTDSGFAAYNRYRI
GNYKLNTDHAGSNDNIALLVQ (SEQ ID NO: 34)
MYSFVSEETGTLIVNSVLLFLAFVVFLLVTLAILTALRLCAYCCNIVNVS
LVKPTVYVYSRVKNLNSSEGVPDLLV (SEQ ID NO: 35) 
MSDNGPQSNQRSAPRITFGGPTDSTDNNQNGGRNGARPKQRRPQGLPNNT

ASWFTALTQHGKEELRFPRGQGVPINTNSGPDDQIGYYRRATRRVRGGDG

KMKELSPRWYFYYLGTGPEASLPYGANKEGIVWVATEGALNTPKDHIGTR

NPNNNAATVLQLPQGTTLPKGFYAEGSRGGSQASSRSSSRSRGNSRNSTP

GSSRGNSPARMASGGGETALALLLLDRLNQLESKVSGKGQQQQGQTVTKK

SAAEASKKPRQKRTATKQYNVTQAFGRRGPEQTQGNFGDQDLIRQGTDYK

HWPQIAQFAPSASAFFGMSRIGMEVTPSGTWLTYHGAIKLDDKDPQFKDN

VILLNKHIDAYKTFPPTEPKKDKKKKTDEAQPLPQRQKKQPTVTLLPAAD

MDDFSRQLQNSMSGASADSTQA (SEQ ID NO: 36) 
BoCov MSSVTTPAP--VYTWTADEAIKPLKEWNFSL

OC43 MSSKTTPAP--VyiWTADEAIKFLKEWNFSL

PHEV MSSPTTPVP--VISWTADEAIKFLKEWNFSL

FCV MKILLILACAVACVYGEQIRYCAMQ-ETGLSCRNGTASKESCFNGGDLIWHIAN^^

TGEV MKILLII^CVIACACGE--RyCAMKSDTDLSCRNSTASIX:ESCFNGGDLIWHIJiN^

TOR2_M MAD-- NGTITVEELKQLLEQWNLVI

ORF5 MAD-- NGTITVEELKQLLEQWNLVI

AIBV2 MMEN CTLNLEQATLLFKEYNLPI

AIBV MSNGTEM CTLSTQQAAELFKEYNLPI

• • • • 4 « 

I 

BoCov GIIXiLFITVILQFGYTSRSMF\maKMVILWLMWPLTIILTimCV--YALNN-\^^

OC43 GIILLFITIILQFGYTSRSMFVWIKMIILWLMWPLTIILTIFNCV--YALNN-VYLGI1S

PHEV GIIVLFITIILQFGYTSRSMFVYVIKMVILWLMWPLTIILTIFNCV--YAIJ3N-VyLGFS

FCV SIILIVFITVnLQYGRPQFSWFVYGIKMLIMWLLWPIVLALTIFNAYSEYEVSRYVMFGFS

TGEV SlILIVFITVLQYGRPQFSWFVYGIKMLIMWLLWPVVLALTIFNAySEYQVSRYVMFGPS

TOR2_M GFLFLAWIMLLQFAYSNRNRFLYIIKLVFLWLLWPVTLACFVLAAV--YRINW-VTGGIA

ORF5 GFLFLAWIMLLQFAYSNRNRFLYIIKLVFLWLLWPVTLACFVLAAV--YRINW-VTGGIA

AIBV2 TAFLLFLTIIiLQyGYATRSRFIYILKMrV^WCFWPIJffIAVGVISCI--yPPOT-GGL^

AIBV TAPLLFLTILLQYGYATRSRFlYIIiKMIVIAfrcFliWtNIAVGIISCI--YP^

• • • • ■* • • 

BoCov IVFTlVAriMWIVYPVNSIRLPIRTGSWWSEWPETNNLMCIDMK-GRMYVRPlIBDYHTL

OC43 IVFTIVAIIMWIVYFVNSIRLFIRTGSFWSFNPETNNLMCIDMK-GTMYVRPllEDYHTL

PHEV IVFTIVAII^flVVVYFVNSIRLFIRTGSWWSFNPETN^MCI^MK-GRMYVRPIIEDYHT

FCV VAGAVVTFALWMMYFVRSIQLYRRTKSWWSFNPETNAILCVNAL-GRSYVliPLDGTPaXSV

TGEV IAGAIVTFVLWIMYFVRSIQXiYRRTKSWWSFNPETKAIIiCVSAL-GRSYVLPLEGVPTGV

TOR2_M IAMACIVGLMWLSYFVASFRLFARTRSMWSFNPETNILLNVPLR-GTIVTRPLMESELVI

ORF5 IAMACIVGLMWLSYFVASFRLFARTRSMWSFNPETNILLNVPLR-GTIVTRPLMESELVI

AIBV2 IILTVFACLSPVGYWIQSCRLFKRCRSWWSFNPESNAVGSILLTNGQQCNFAIESVPMVL

AIBV IILTVFACLSFVGYVOQSPRLFKRCRSWWSFKnPESNAVGSlLLTNGQQCNFAIESVPMV^

BoCov TVTIIRGHLYMQGIKLGTGySI.SDLPAyVTVAKVSHLLTYKR GFLDKIGDTSGPAVY

OC43 TVTIIRGHLYIQGIKLGTGYSWADLPAYMTVAKVTHLCTYKR GFLDRISDTSGPAVY

PHEV TATIIRGHLYIQGIKLGTGYSLSDLPAYVTVAKVTHLCTYKR GFLDRIGDTSGPAVY

FCV TLTLLSGNLYAEGFKMAGGLTIEHLPKYVMIRTFNRTIVYTLV--GKQLKATTATGWAYY

TGEV TLTIiLSGNLYi^GFKIAGGMNIDNLPKYVMVALPSRTIVYTLV--GKKLKASSATGWAYY

TOR2_M GAVIIRGHLRMAGHSLGR-CDIKDLPKEITVAT-SRTLSYYKL--GASQRVGTDSGFAAY

ORF5 GAVIIRGHLRMAGHSLGR-CDIKDLPKEITVAT-SRTLSYYKL--GASQRVGTDSGFAAY

AIBV2 APIIKNGVLYCEGQWLAK-CEPDHLPKDIFVCTPDRRNIYRMVQKYTGDQSGNKKRVATF

AIBV SPIIloeGALYCEGQWIjaC-CEPDHLPKDXFVCTFDRRNIYRMVQKyTGDQSGNKKRPATP

. * * * . ** .* ■ ^ it m 

* ••• • ••■ 

BoCov VKSfCVGNYRIiPSTQKGSGLDTALIiRNNr 

OC43 VKSKVGNYRLPSTQKGSGMDTALIiRNNI 

PHEV VKSKVGNYRLPSTHKGSGMDTALLRNNI 

FCV VKSKAGDYSTEARTDNLSEHEKLLHMV- 

TGEV VKSXAGDYSTEARTDNLSEQEKLLHMV- 

TOR2_M NRYRIGNYKLNTDHAGSNDNIALLVQ--

ORF5 NRYRIGNYKLNTDHAGSNDNIALLVQ--

AIBV2 VYAKQSVDTGEI.ESVPTGGSSLYT

AIBV VYAKQSVDTGEIiGSVATGGSSLYT



Key          Name                                                          GenBank     %ID

PHEV         Porcine hemagglutinating encephalomyelitis virus              AAL80035    40.4%   (SEQ ID NO: 37)
BoCov        matrix protein [Bovine coronavirus]                           NP_150082   40.0%   (SEQ ID NO: 38)
AIBV         membrane protein [Avian infectious bronchitis virus]          AAF35e63    31.3%   (SEQ ID NO: 39)
TGEV         membrane protein [Transmissible gastroenteritis virus]        NP_058427   28.5%   (SEQ ID NO: 40)
FCV          membrane [feline coronavirus]                                 BAC01160    27.7%   (SEQ ID NO: 41)
OC43         membrane glycoprotein [Human coronavirus OC43]                AAA45462    39.1%   (SEQ ID NO: 42)
AIBV2        membrane protein [Avian infectious bronchitis virus]          AAK83027    32.0%   (SEQ ID NO: 43)
TOR2_M/ORF5  SARS-associated coronavirus M glycoprotein                                        (SEQ ID NO: 34)
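
The %ID column in the key above reports identity between each listed membrane protein and the SARS M protein. The document does not state how the figures were computed, so the following is only a minimal sketch of one common definition: identical residues divided by the number of aligned columns in which neither row has a gap. The function name `percent_identity` and the gap character "-" are assumptions made for illustration, and the two example rows are the ORF5 and FCV rows of the last alignment block, with the trailing gap run written as "--".

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percent identity between two equal-length rows of an alignment.

    Columns where either row holds a gap are skipped (an assumption;
    other definitions divide by the full alignment length instead).
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            continue  # skip columns where either row has a gap
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        return 0.0  # no gap-free columns to compare
    return 100.0 * matches / compared


# Rows from the last block of the alignment (ORF5 and FCV).
orf5_tail = "NRYRIGNYKLNTDHAGSNDNIALLVQ--"
fcv_tail = "VKSKAGDYSTEARTDNLSEHEKLLHMV-"
print(f"{percent_identity(orf5_tail, fcv_tail):.1f}% identity over this fragment")
```

A value computed over one short block will generally differ from the whole-protein figures in the key, which cover the full alignment.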
BoCov MSFTPGKQSS- SRASSGNRSGNGILK WADQSDQSRNVQTRGKRAQP--KQTATSQQP

OC43 MSFTPGKQSS- SRASSGNRSGNGILK WADQSDQVRNVQTRGRBAQP--KQTATSQQP

PHEV MSFTPGKQSS- SRASSGNRSGNGILK WADQSDQSRNVQTRGRRVQS--KQTATSCXJ?

MHV MSFVPGQENAGSRSSSVNRAGNGIIiKKTTWADQTERGPNNQNRGRRNQP--KQTATTQ-P

AIBV2 MASGKAAGK TDAPAPVIK LGGPKPP--KVGSSGN--

TCV MASGKATGK TDAPAPIIK liGGPKPP--KVGSSOI--

AIBV MASGKAAGK TDAPAPVIK LGGPKPP--KVGSSGN--

FCV MATQGQRVN WGDEPSKRR GRSNSR--GRKNNDIP-

PTGV MANQGQRVS WGDESTKTR GRSNSR--GRKNNNIP-

229E MATVK WADASEPQR GRQ GRIPYSL--

TOR2_N MSDNGPQSNQRSAPRITFGGPTDSTDNNQNGGRNGARPKQRRPQGWN

* 



BoCov SGGNWPYYSWFSGITQFQKGKEFEFAEGQGVPIAPGVPATEAKGYWyRHNRRSFKTADG

OC43 SGGNWPyySWFSGITQFOKGKEFEFVEGQGPPIAPGVPATEAKGVWYRHNRGSFKTADG

PHEV SGGTVVPYYSWFSGITQFQKGKEFEFAEGQGVPIAPGVPSTEAKGYWYRHNRRSFKTADG

MHV NSGSWPHYSWFSGITQFQKGKEFQFAQGQGVPIANGIPASEQKGYWYRHNRRSFKTPDG

AIBV2 AS _---WFQAIKAKKLin'PPPKFEGSGVPimNIKPSQQHGVWRRiQAR--FKPGKG

TCV AS - WFQSIKAKKLNSPQPKFEGSGVPI»IENIKTSQQHGYWRRQAR--FKPOKG

AIBV AS WFQALKAKKLNAPAPKFEGSGVPDNENIiKISQQHGyWRRQAR--YKF^aCG

FCV M yFNPITLIXy3SKFWNLCPRDFVPKGlGNK-DQQlGYWNRQAR--yRIVK!Q

PTGV LS FFNPITLQQGSKFWNLCPRDFVPKGIGMR-DQQIGYWNRQTR--YRMVKQ

229E -y- SPXiLVDS- EQPWKVIPRNIiVPINKKDK-NKLIGYWNVQKR-- FRTRKG

TOR2_N WFTALTQHG-KEELRFPRGQGVPINTNSGPDDQIGYYRRATRR-VRGGDG



BoCov NQRQLLPRWYFYYLGTGPHAKDQYGTDIDGVYWVASNQADVNTPADILDRDPSSDEAIPT

OC43 NQRQLLPRWYFYYLGTGPHAKDQYGTDIDGVyWVASNQ7a)VNTPADIVDRDPSSDEAlPT

PHEV KQRQLLPRWYFYYLGTGPHAKDQYGTDIDGVFWVASNQADINTPADIVDRDPSSDEAIPT

MHV QQKQLLPRWYFYYLGTGPHAGAEYGDDlDGVVWASQQAiyrKTTADIVERDPSSHEAlPT

AIBV2 GRKPVPDAWYFYYTGTGPAADLimGITrQIXSIVWVAAKGADTKSXlSNQGTRDPDKFI^

TCV GRKPVPDAWYFYYTGTGPi^IJWGDTQDGrVWVAAKGADVKSRSNQGTRDPDKFDQYPL

AIBV GRKPVPDAWYFYYTGTGPAADIJ^WGDSQIX3rVWVAAKGADVKSRSNQGTRDPDKFDQYPL

FCV QRVELPERWFFYFIfGTGPHADAKFKAKII)GVFWVAKDGAMN-KPTSIjGTRG-TNNESKPL

PTGV QRKELPERWFFYYLGTGPHADAKFKDKLDGVVVWAKDGAMN-KPTTUSSRG-ANNESKAL

229E KRVDLSPKLHFyYLGTGPHKI)AKFRERVEGVVWVAVDGAKT-EPTGYGVRR--KNSEPEIP

TOR2_N KMKELSPRWYFYYLGTGPEASLPYGANKEGIVWVATEGALNTPKDHIGTRNPNNNAATVL



**** 



*** 



BoCov RFPPGTVLPQGYYIEGS - GRSAPNSRSTSRASSRASSA GSRSRANSGNR TPTSG

OC43 RFPPGTVLPQGYYIEGS-GRSAPNSRSTSRTSSRASSA GSRSRANSGNR TPTSG

PHEV RFPP^TVLPQGYYIBGS-GRSAPNSRSTSRAIWRAPSA GSRSRANSGNR TSTPG

MHV RFAPGTVLPQGFYVEGS-GRSAPASRSGSRSQSRGP NNRARSSSNQR QPAST

AIBV2 RFSDG--GPOGNFRWDF-IPLKNRGRSG-RSTAASSAA ASRAPSREGSR GRRSD

TCV RFSDG--GPDSNFRWDF-IPLH-RGRSG-RSTAASSAA SSRAPSRDGSR GRRSG

AIBV RFSDG--GPDGNFRWDF-IPLN-RGRSG-RSTAASSAA SSRAPSREGSR GRIJ^JO

FCV KFDGK-IPPQFQLEVNR- SRNKSRSGSQSRSVSRNRS QSRGRQQSNNQ--NTNVBD

PTGV KFDGK-VPGEFQLEVNQ-SRDNSRLRSQSRSRSRNRS QSRGRQQSNNKK-DDSVEQ

229E HFNQK--LPNGVTWEE-PDSRAPSRSQSRSQSRGRGESKPQSRNPSSDRNHNSQDDIMK

TOR2_N QLPQGTTLPKGFYAEGSRGGSQASSRSSSRSRGNSRNSTPGSSRGNSPARMAS-GGGETA



BoCov VTPDMADQIASLVLAKLGKDAAKP QQVTKQTAKEIRQK--IL

OC43 VTPDMADQIASLVLAKLGKDATKP QQVTKHTAKEVRQK--IL

PHEV VTPDMADQIASLVLAKLGKDATKP QQVTKQTAKEVRQK--Hi

MHV VKPDMAEEIAALVliAKLGKDAGQP KQVTKQSAKEVRQK--IL

AIBV2 SGDDIilARAAKIlQDQQKKGS RITKAKADEMAHR--RY

TCV SEDDLIARAAKIIQDQQKKGS RITKAKADEMAHR-RY

AIBV AEDDLIARAAKIIQDQQKKGS RITKAKAEEMIHR--RY

FCV TIVAVLQKLGVTDK QRSRSKS GERSQSKSRDTTPK--NA

PTGV AVLAALKKLGVYTEKQQQRSRSKS KERSNSKIRDTTPK--NB

229E AVAAALKSLGFDKPQEKDKKSAKTGTPKPSRNQSPASSQTSAKSLARSQSSETKEQKHEM

TOR2_N LALLLLDRLNQLESKVSGKGQQQQG QTVTKKSAAEASKK--PR

• • • • • 
BoCov NKPRQKRSFNKQCT--VQQCFGKR GPNQNFGGGEHLKLGTSDFQFPIIiAELAFTAGA

OC43 NXFRQXRSPNKQCT--VQQCFGXR--- GPNQNFGGGEMLKLGTSpPQFPILAELAPTAGA

PHEV NKPRQKRSPNKQCT--VQQCFGKR QPNQNFGGGEMLKLGTSDPQFPILAELAPTAGA

MHV NKPRQKRTPNKQCP--VQQCFGKR GPNQNFGGSEMLKLGTSDPQFPIIAELAPTPSA

AIBV2 CK RTIPPNYR--VDQVFGPRT-KGKEGNFGDDKMNEEGIKDGRVTAMLNLVPSSHA

TCV CK RTVPPGYK--VDQVPGPRT-KGKEGNFGDDKMNEEGIKDGRVTAMLNLVPSSHA

AIBV CK RTVPPGVS-- IDKVFGPRT-KGKEGNFGDDKMNEEGIKDGRVTAMLNLVtSSHA

FCV NKHTWKKTAGKGD VTNFYGAR SSSANFGDSDLVANGNAAKCYPQIAECVPSVSS

PTGV NKHTWKRTAGKGD---VTRFYGTR SNSANFGDSDLVANGSSAKHYPQIiAECVl'SVSS

229E QKPRWKRQPMDDVTSNVTQCPGPR DLDHNFGSAGVVANGVKAKGYPQFABLVPBTAA

TOR2_N QK RTATKQYN--VTQAFGRRGPEQTQGNFGDQDLIRQGTDYKHWPQIAQFAPSASA
BoCov FFFGSRLELAKVQNLSGNLDEPQKDVYEIiRyNGAlR™ --FDSTLSGPETIMKVLNENL

OC43 FFFGSRLELAKVQNLSGNPDEPQKDVYELRYNGAIR FDSTLSGFETIMKVLNENIi

PHEV FFFGSRLELAKVQNLSGNPDEPQKDVYELRYNGAIR FDSTLSGFETIMKVLNQNIi

MHV FFFGSKLELVKKN-- SGGADDPTKDVYELQYSGAIR FDSTLPGFETIMKVLNENIi

AIBV2 CLFGSRVTPKLQL--DGLHLRFEFTTVVPCDDPQFDNYVKICDQCVDGVGTRPKDDEPKP

TCV CLFGSRVTPKIjQP--IX3LHLRFEFTTVVPRDDPQFDNYVTICDQCVDGIGTRPKDNEPRP

AIBV CLFGSQVTPKLQP--DGLHLTFRFTTWSRDDPQFDNYVKICDECVDGVGTRPKDEWRP

FCV ILFGSQWSAEEAG--DQVKVTIiTHNYYLPKDDAKTS QPLBQI

PTGV ILFGSYWTSKEDG--DQIEVTFTHKYHLPKDDPKTG QFLQQI

229E MLFDSHIVSKESG--NTWLTFTTRVTVPKDHPHLG KFI£EL

TOR2_N FFGMSRIGMEVTP--SGTWLTYHGAIKLDDKDPQFK DN VILLNKHI
BoCov KAYQQQ-DGTMNMSPKPQRQRG QKNGQGENDNISVAAPKSRVQQNKIRELTAEDIS

OC43 NAYQQQ-DGMMNMSPKPQRQRG HKNGQGENDNISVAVPKSRVQQNKSRELTAEDIS

PHEV NAYQHQEDGMMNISPKPQRQRG QKNGQVENBNVSVAAPKSRVQQNKSRELTAEDIS

MHV DAYQDQAGGADWSPKPQRKRGT--KQKALKGEVDNVSVAKPKSSVQRNVSRELTPEDRS

AIBV2 KSRSSSRPATRGNSPAPRQQRPK--KEKKLKKQDDEADKALTSDEERNNAQLEFYDEP-K

TCV KSRPSSRPATRGNSPAPRQQRPK--KEKKPKKQDDEVDKALTSDEERNNAQLEFDDEP-K

AIBV KSRSSSRPATRGTSPAPKQQRPK--KEKKPKKQDDEVDKALTSDEERHNAQLBFDDEP-K

FCV DAYKRP SEVAKDQRQ RKSRSKSADKKPEEIiS--VTLEAYTDVFDDTQVE

PTGV NAYARP SEVAKEQRK RKSRSKSAERSEQEWPDALIENYTDVFDDTQVE

229E NAFTRE MQQHP LLNPSALEFNPSQTSPATAEPVRDEVSIET-D

TOR2_N DAYKTFPP TEPKKDKKKKTDEAQPLPQRQKKQPTVTLLPAADMDDFSRQLQNSMSG

• : i : ... 

BoCov LLKKMDEP FTEDTSEI

OC43 LLKKMDEP YTEDTSEI

PHEV LLKKMDEP YTEDTSEI

MHV LLAQILDDGWPIXJLEDDSNV

AIBV2 VINWGDAA LGENEIi--

TCV VINWGDSA LOENHIi--

AIBV VINWQDSA- LGENEL--

FCV MIDEVTN

PTGV MIDEVTO

229E IIDEVN

TOR2_N ASADSTQA
nucleoprotein [porcine transmissible gastroenteritis virus].
nucleocapsid protein [Human coronavirus 229E].
NUCLEOCAPSID PROTEIN.

nucleocapsid protein [porcine hemagglutinating encephalomyelitis]

nucleocapsid protein [turkey coronavirus].

SARS-associated virus nucleocapsid protein (SEQ ID NO: 36)
ATATTAGGTTTTTACCTACCCAGGAAAAGCCAACCAACCTCGATCTCTTG

TAGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTAGCTGTCGCTCGGC 

TGCATGCCTAGTGCACCTACGCAGTATAAACAATAATAAATTTTACTGTC 

GTTGACAAGAAACGAGTAACTCGTCCCTCTTCTGCAGACTGCTTACGGTT 

TCGTCCGTGTTGCAGTCGATCATCAGCATACCTAGGTTTCGTCCGGGTGT 

GACCGAAAGGTAAGATGGAGAGCCTT6TTCTTGGTGTCAACGAGAAAACA 

CACGTCCAACTCAGTT4^CTGTCCTTCAGGTTAGAGACGTGCTAGTGCG 

TG6CTTCGGGGACTCTGTGGAAGAGGCCCTATCGGAGGCAC6TGAACACC 

TCAAAAATGGCACTTGTGGTCTAGTAGAGCTGGAAAAAGGCGTACtGCCC 

CAGCTTGAACAGCCCTATGTGTTCATTAAACGTTCTGATGCCTTAAGCAC 

CAATCACGGCCACAAGGTCGTTGAGCTGGTTGCAGAAATGGACGGCATTC 

AGTACGGTCGTAGCGGTATAACACTGGGAGTACTCGTGCCACATGTGGGC 

GAAACCCCAATTGCATACOSCAATGTTCTTCTTCGTAAGAACGGTAATAA 

GGGAGCCGGTGGTCATAGCTATGGCATCGATCTAAAGTCTTATGACTTAG 

GTGACGAGCTTGGCACTGATCCCATTGAAGATTATGAACAAAACTGGAAC 

ACTAAGCATGGCAGTGGTGCACTCCGTGAACTCACTCGTGAGCTCAATGG 

AGGTGCAGTCACTCGCTATGTCGACAACAATTTCTGTGGCCCAGATGGGT 

ACCCTCTTGATTGCATCAAAGATTTTCTCGCACGCGCGGGCAAGTCAATG 

TGCACTCTTTCCGAACAACTTGATTACATCGAGTCGAAGAGAGGTGTCTA 

CTGCTGCCGTGACCATGAGCATGAAATTGCCTGGTTCACTGAGCGCTCTG 

ATAAGAGCTACGAGCACCAGACACCCTTCGAAATTAAGAGTGCCAAGAAA 

TTTGACACTTTCAAAGdGGAA^lWCCAAAGTTTGTGTTTCCTC 

AAAAGTCAT^GTCATTCAACCACGTGTTGAAAAGAAAAAGACTGAGGGTT 

TCATGGGGCGTATACGCTCTGTGTACCCTGTTGCATCTCCACAGGAGTGT 

AACAATATGCACTTGTCTACCTTGATGAAATGTAATCATTGCGATGAAGT 

TTCATGGCAGACGTGCGACTTTCTGAAAGCCACTTGTGAACATTGTGGCA 

CTGAAAATTTAGTTATTGAAGGACCTACTACATGTGG6TACCTACCTACT 

AATGCTGTAGTGAAAATGCCATGTCCTGCCTGTCAAGACCCAGAGATTGG 

ACCTGAGCATAGTGTTGCAGATTATCACAACCACTCAAACATTGAAACTC 

GACTCCGCAAGGGAGGTAGGACTAGATGTTTTGGAGGCTGTGTGTTTGCC 

TATGTTGGCTGCTATAATAAGCGTGCCTACTGGGTTCCTCGTGCTAGTGC 

TGATATTGGCTCAGGCCATACTGGCATTACTGGTGACAATGTGGAGACCT 

TGAATGAGGATCTCCTTGAGATACTGAGTCGTGAACGTGTTAACATTAAC 

ATTGTTGGCGATTTTCATTTGAATGAAGAGGTTGCCATCATTTTGGCATC 

TTTCTCTGCTTCTACAAGTGCCTTTATTGACACTATAAAGAGTCTTGATT 

ACAAGTCTTTCAAAACCATTGTTGAGTCCTGCGGTAACTATAAAGTTACC 

AAGGGAAAGCCCGTAAAAGGTGCTTGGAACATTGGACAACAGAGATCAGT 

TTTAACACCACTGTGTGGTTl"rCCCTCACAGGCTGCTGGTGTTATCAGAT 

CAATTTTTGCGCGCACACTTGATGCAGCAAACCACTCAATTCCTGATTTG 

CAAAGAGCAGCTGTCACCATACTTGATGGTATTTCTGAACAGTCATTACG 

TCTTGTCGACGCCATGGTTTATACTTCAGACCTGCTCACCAACAGTGTCA 

TTATTATGGCATATGTAACTGGTGGTCTTGTACAACAGACTTCTCAGTGG 

TTGTCTAATCTTTTGGGCACTACTGTTGAAAAACTCAGGCCTATCTTTGA 

ATGGATTGAGGCGAAACTTAGTGCAGGAGTTGAATTTC^AAGGATGCTT 

GGGAGATTCTCAAATTTCTCATTACAGGTGTTTTTGACATCGTCAAGGGT 

CAAATACAGGTTGCTTCAGATAACATCAAGGATTGTGTAAAATGCTTCAT 

TGATGTTGTTAACAAGGCACTCGAAATGTGCATTGATCAAGTCACTATCG 

CTGGCGCAAAGTTGCGATCACTCAACTTAGGTGAAGTCTTCATCGCTCAA 

AGCAAGGGACTTTACCGTCAGTGTATACGTGGCAAGGAGCAGCTGCAACr 

ACTCATGCCTCTTAAGGCACCAAAAGAAGTAACCTTTCTTGAAGGT^ 

CACATGACACAGTACTTACCTCTGAGGAGGoyrGTTCTCAAGAACGGTGAA 

CTCGAAGCACTCGAGACGCCCGTTGATAGCTTCACAAATGGAGCTATCGT 

TGGCACACCAGTCTGTGTAAATGGCCTCATGCTCTTAGAGATTAAGGACA 

AAGAACAATACTGCGCATTGTCTCCTGGTTTACTGGCTACAAACAATGTC 

TTTCGCTTAAAAGGGGGTGCACCAATTAAAGGTGTAACCTTTGGAGAAGA 

TACTGTTTGGGAAGTTCAAGGTTACAAGAATGTGAGAATCACATTTGAGC 

TTGATGAACGTGTTGACAAAGTGCTTAATGAAAAGTGCTCTGTCTACACT 
GTTGAATCCGGTACCGAAGTTACTGAGTTTGCATGTGTTGTAGCAGAGGC

TGTTGTGAAGACTTTACAACCAGTTTCT6ATCTCCTTACCAACATGGGTA 

TTGATCTTGATGAGTGGAGTGTAGCTACATTCTACTTATTTGATGATGCT 

GGTGAAGAAAACTTTTCATCACGTATGTATTGTTCCTTTTACCCTCCAGA 

TGAGGAAGAAGAGGACGATGCAGAGTGTGAGGAAGAAGAAATTGATGAAA 

CCTGTGAACATGAGTACGGTACAGAGGATGATTATCAAGGTCTCCCTCTG 

GAATTTGGTGCCTCAGCTGAAACAGTTCGAGTTGAGGAAGAAGAAGAGGA 

AGACTGGCTGGATGATACTACTGAGCAATCAGAGATTGAGCCAGAACCAG 

AACCTACACCTGAAGAACCAGTTAATCAGTTTACTGGTTATTOAAAAC^ 

ACTGACAATGTTGCCATTAAATGTGTTGACATCGTTAAGGAGGCACAAAG' 

TGCTAATCCTATGGTGATTGTAAATGCTGCTAACATACACCTGAAACATG 

GTGGTGGTGTAGCAGGTGCACTCT^CAAGGCAACCAATGGTGCCATGCAA 

AAGGAGAGTGATGATTACATTAAGCTAAATGGCCCTCTTACAGTAGGAGG 

GTCTTGTTTGCTTTCTGGACATAATCTTGCTAAGAAGTGTCTGCATGTTC 

TTGGACCTAACCTA2^TGCAGGTGAGGACATCCAGCTTCTTAAGGai^^ 

TATGAAAATTTCAATTCACAGGACATCTTACTTGCACCATTGTTGTCA^ 

AGGCATATTTGGTGCTAAACCACTTCAGTCTTTACAAGTGTGCGTGCAGA 

CGGTTCGTACACAGGTTTATATTGCAGTCAATGACAAAGCTCTTTATGAG 

CAGGTTGTCATGGATTATCTTGATAACCTGAAGCCTAGAGTGGAAGCACC 

TAAACAAGAGGAGCCACCAAACACAGAAGATTCCAAAACTGAGGAGAAAT 

CTGTCGTACAGAAGCCTGTCGATGTGAAGCCAAAAATTAAGGCCTGCATT 

GATGAGGTTACCACAACACTGGAAGAAACTAAGTTTCTTACCAATAAGTT 

ACTCTTGTTTGCTGATATCAATGGTAAGCTTTACCATGATTCTCAGAACA 

TGCTTAGAGGTGAAGATATGTCTTTCCTTGAGAAGGATGCACCTTACATG 

GTAGGTGATGTTATCACTAGTGGTGATATCACTTGTGTTGTAATACCCTC 

CAATU^GGCTGGTGGCACTACTGAGATGCTCTCAAGAGCTTTGAAGAAAG 

TGCCAGTTGATGAGTATATAACCACGTACCCTGGACAAGGATGTGCTGGT 

TATACACTTGAGGAAGCTAAGACTGCTCTTAAGAAATGCAAATCTGCATT 

TTATGTACTACCTTCAGAAGCACCTAATGCTAAGGAAGAGATTCTAGGAA 

CTGTATCCTGGAATTTGAGAGAAATGCTTGCTCATGCTGAAGAGACAAGA 

AAATTAATGCCTATATGCATGGATGTTAGAGCCATAATGGCAACCATCCA 

ACGTAAGTATAAAGGAATTAAAATTCAAGAGGGCATCGTTGACTATGGTG 

TCCGATTCTTCTTTTATACTAGTAAAGAGCCTGTAGCTTCTATTATTACG 

AAGCTGAACTCTCTAAATGAGCCGCTTGTCACAATGCCAATTGGTTATGT 

GACACATGGTTTTAATCTTGAAGAGGCTGCGCGCTGTATGCGTTCTCTTA 

AAGCTCCTGCCGTAGTGTCAGTATCATCACCAGATGCTGTTACTACATAT 

AAT6GATACCTCACTTCGTCATCAAAGACATCTGAGGAGCACTTTGTAQA 

AACAGTTTCTTTGGCTGGCTCTTACAGAGATTGGTCCTATTCAGGACAGC 

GTACAGAGTTAGGTGTTGAATTTCTTAAGCGTGGTGACAAAATTGTGTAC 

CACACTCTGGAGAGCCCCGTCGAGTTTCATCTTGACGGTGAGGTTCTTTC 

ACTTGACAAACTAAAGAGTCTCTTATCCCTGCGGGAGGTTAAGACTATAA 

AAGTGTTCACAACTGTGGACAACACTAATCTCCACACACAGCTTGTGGAT 

ATGTCTATGACATATGGACAGCAGTTT66TCCAACATACTTGGATGGTGC 

TGATGTTACAAAAATTAAACCTCATGTAAATCATGAGGGTAAGACTTTCT 

TTGTACTACCTAGTGATGACACACTACGTAGTGAAGCTTTCGAGTACTAC 

CATACTCTTGATGAGAGTTTTCTTGGTAGGTACATGTCTGCTTTAAACCA 

CACAAAGAAATGGT^TTTCCTCAAGTTGGTGGTTTAACTTCAATTAAAT 

GGGCTGATAACAATTGTTATTTGTCTAGTGTTTTATTAGCACTTCAACAG 

CTTGAAGTCAAATTCAATGCACCAGCACTTCAAGAGGCTTATTATAGAGC 

CCGTGCTGGTGATGCTGCTAACTTTTGTGCACTCATACTCGCTTACAGTA 

ATAAAACTGTTGGCGAGCTTGGTGATGTCAGAGAAACTATGACCCATCTT 

CTACAGCATGCTAATTTGGAATCTGCAAAGCGAGTTCTTAATGTGGTGTG 

TAAACATTGTGGTCAGAAAACTACTACCTTAACGGGTGTAGAAGCTGTGA 

TGTATATGGGTACTCTATCTTATGATAATCTTAAGACAGGTGTTTCCATT 

CCATGTGTGTGTGGTCGTGATGCTACACAATATCTAGTACAACAAGAGTC 

TTCTTTTGTTATGATGTCTGCACCACCTGCTGAGTATAAATTACAGCAAG 

GTACATTCTTATGTGCGAATGAGTACACTGGTAACTATCAGTGT6GTCAT 
TACACTCATATAACTGCTAAGGAGACCCTCTATCGTATTGACGGAGCTCA
CCTTACAAAGATGTCAGAGTACAAAGGACCAGTGACTGATGTTTTCTACA 
AGGAAACATCTTACACTACAACCATCAAGCCTGTGTCGTATAAACTCGAT 
GGAGTTACTTACACAGAGATTGAXCCAAAATTGGATGGGTATTATAAAAA 
GGATAATGCTTACTATACAGAGCAGCCTATAGACCTTGTACCAACTCAAC 

cattaccaaatgcgagttttgataatttcaaactcacatgttctaacaca 

aaatttgctgatgattlfeaatcaaatgacaggcttcacaaagccagcttc 

acgagagctatctgtcacattcttcccagacttgaatggcgatgtagtgg 

ctattgactatagacactattcagcgagtttcaagaaaggtgctaaatta 

ctgcataagccaattgtttggcacattaaccaggctacaaccaagacaac 

gttcj^aaccaaacacttggtgtttacgttgtctttggagtacaaagccag 

tagatacttcaaattcatttgaagttctggcagtagaagacacacaagga 

atggacaatcttgcttgtgaaagtcaacaacccacctctgaagaagtagt 

ggaaaatcctaccatacagaaggaagtcatagagtgtgacgtgaaaacta . 

ccgaagttgtaggcaatgtcatacttaaaccatcagatgaaggtgttaaa 

gtaacacaagagttaggtcatgaggatcttatggctgcttatgtggaaaa 

cacaagcattaccattaagaaacctaatgagctttcactagccttaggtt 

taaaaacaattgccactcatggtattgctgcaattaatagtgttccttgg 

AGTAAAATTTTGGCTTATGTCAAACCATTCTTAGGACAAGCAGCAATTAC 

aacatcaaattgcgctaagagattagcacaacgtgtgtttaacaattata . 

TGCCTTATGTGTTTACATTATTGTTCCTATTGTGTACTTTTACTAAAAGT 

ACCAATTCTAGAATTAdAGCTTCACTACCTACAACTATTGCTAAAAATAG 

TGTTAAGAGTGTTGCTAAATTATGTTTGGATGCCGGCATTAATTATGTGA 

AGTCACCCAAATTTTCTAAATTGTTCACAATCGCTATGTGGCTATTGTTG 

TTAAGTATTTGCTTAGGTTCTCTAATCTGTGTAACTGCTGCTTTTGGTGT 

ACTCTTATCTAATTTTGGTGCTCCTTCTTATTGTAATGGCGTTAGAGAAT 

TGTATCTTAATTCGTCTAACGTTACTACTATGGATTTCTGTGAAGGTTCT 

TTTCCTTGCAGCATTTGTTTAAGTGGATTAGACTCCCTTGATTCTTATCC 

AGCTCTTGAAACCATTCAGGTGACGATTTCATCGTACAAGCTAGACTTGA 

CAATTTTAGGTCTGGCCGCTGAGTGGGTTTTGGCATATATGTTGTTCACA 

AAATTCTTTTATTTATTAGGTCTTTCAGCTATAATGCAGGTGTTCTTTGG 

CTATTTTGCTAGTCATTTCATCAGCAATTCTTGGCTCATGTGGTTTATCA 

TTAGTATTGTACAAATGGCACCCGTTTCTGCAATGGTTAGGATGTACATC 

TTCTTTGCTTCTTTCTACTACATATGGAAGAGCTATGTTCATATCATGGA 

TGGTTGCACCTCTTCGACTTGCATGATGTGCTATAAGCGCAATCGTGCCA 

CACGCGTTGAGTGTACAACTATTGTTAATGGCATGAAGAGATCTTTCTAT 

GTCTATGCAAATGGAGGCCGOXSGCTTCTGCAAGACTCACAATTGGAATTG 

TCTCAATTGTGACACATTTTGCACTGGTAGTACATTCATTAGTGATGAAG 

TTGCTCGTGATTTGTCACTCCAGTTTAAAAGACCAATCAACCCTACTGAC 

CAGTCATCGTATATTGTTGATAGTGTTGCTGTGAAAAATGGCGCGCTTCA 

CCTCTACTTTGACAAGGCTGGTCAAAAGACCTATGAGAGACATCCGCTCT 

CCCATTTTGTCAATTTAGACAATTTGAGAGCTAACAACACTAAAGGTTCA 

CTGCCTATTAATGTCATAGTTTTTGATGGCAAGTCCAAATGCGACGAGTC 

TGCTTCTAAGTCTGCTTCTGTGTACTACAGTCAGCTGATGTGCCAACCTA 

TTCTGTTGCTTGACCAAGCTCTTGTATCAGACGTTGGAGATAGTACTGAA 

GTTTCCGTTAAGATGTTTGATGCTTATGTCGACACCTTTTCAGCAACTTT 

TAGTGTTCCTATGGAAAAACTTAAGGCACTTGTTGCTACAGCTCACAGCG 

AGTTAGCAAAGGGTGTAGdTTTAGATGGTGTCCTTTCTACATTCGTGTCA 

GCTGCCCGACAAGGTGTTGTTCATACCGATGTTGACACAAAGGATGTTAT 

TGAATGTCTCAAACTTTCACATCACTCTGACTTAGAAGTGACAGGTGACA 

GXTGTAACAATTTCATGCTCACCTATAATAAGGTTGAAAACATGACGCCC 

AGAGATCTTGGCGCATGTATTGACTGTAATGCAAGGCATATCAATGCCCA 

AGTAGCAAAAAGTCACAATGTTTCACTCATCTGGAATGTAAAAGACTACA 

TGTCTTTATCTGAACAGCTGCGTAAACAAATTCGTAGTGCTGCCAAGAAG 

AACAACATACCTTTTAGACTAACTTGTGCTACAACTAGACAGGTTGTCAA 

TGTCATAACTACTAAAATCTCACTCAAGGGTGGTAAGATTGTTAGTACTT 

GTTTTAAACTTATGCTTAAGGCCAOVTTATTGTGCGTTCTTGCTGCATTG 
GTTTGTTATATCGTTATGCCAGTACy^TACATTGTCAATCCATGATGGTTA

CACAAATGAAATCATTGGTTACAAAGCCATTCAGGATGGTGTCACTCGTG 

ACATCATTTCTACTGATGATTGTTTTGCAAATAAACATGCTGGTTTTGAC 

GCATGGTTTAGCCAGCGTGGTGGTTCATACAAAAATGACAAAAGCTGCCC 

TGTAGTAGCTGCTATCATTACAAGAGAGATTGGTTTCATAGTGCCTGGCT 

TACCGGGTACTGTGCTGAGAGCAATCAATGGTGACTTCTTGCATTTTCTA 

CCTCGTGTTTTTAGTGCTGTTGGCAACATTTGCTACACACCTTCCAAACT 

CATTGAGTATAGTGATTTTGCTACCTCTGCTTGOSTTCTTGCTGCTGM 

GTACAATTTTTAAGGATGCTATGGGOU^CCTGTGCCATATTGTTATG^ 

ACTAATTTGCTAGAGGGTTCTATTTCTTATAGTGAGCTTCGTCCAGACAC ' 

TCGTTATGTGCTTATGGATGGTTCCATCATACAGTTTCCTAACACTTACC 

TGGAGGGTTCTGTTAGAGTAGTAACAACTTTTGATGCTGAGTACTGTAGA. 

CATGGTACATGCGAAAGGTCAGAAGTAGGTATTTGCCTATCTACCAGTGG 

TAGATGGGTTCTTAATAATGAGCATTACAGAGCTCTATCAGGAGTTTTCT 

GTGGTGTTGATGCGATGAATCTCATAGCTAACATCTTTACTCCTCTTGTG 

CAACCTGTGGGTGCTTTAGATGTGTCTGCTTCAGTAGTGGCTGGTGGTAT 

TATTGCCATATTGGTGACTTGTGCTGCCTACTACTTTATGAAATTCAGAC 

GTGTTTTTGGTGAGTACAACCATGTTGTTGCTGCTAATGCACTTTTGTTT 

TTGATGTCTTTCACTATACTCTGTCTGGTACCAGCTTACAGCTTTCTGCC 

GGGAGTCTACTCAGTCTTTTACTTGTACTTGACATTCTATTTCACCAATG 

ATGTTTCATTCTTGGCTCACCTTCAATGGTTTGCCATGTTTTCTCCTATT 

GTGCCTTTTTGGATAACAGCAATCTATGTATTCTGTATTTCTCTGAAGCA 

CTGCCATTCGTTCTTTAACAACTATCTTAGGAAAAGAGTCATGTTTA^ 

GAGTTACATTTAGTACCTTCGAGGAGGCTGCTTTGTGTACCTTTTTGCTC 

AACAAGGAAATGTACCTAAAATTGCGTAGCGAGACACTGTTGCCACTTAC 

ACAGTATAACAGGTATCTTGCTCTATATAACAAGTACAAGTATTTCAGTG 

GAGCCTTAGATACTACCAGCTATCGTGAAGCAGCTTGCTGCCACTTAGCA 

AAGGCTCTAAATGACTTTAGCAACTCAGGTGCTGATGTTCTCTACCAACC 

ACCACAGACATCAATCACTTCTGCTGTTCTGCAGAGTGGTTTTAGGAAAA 

TGGCATTCCCGTCAGGCAAAGTTGAAGGGTGCATGGTACAAGTAACCTGT 

GGAACTACAACTCTTAATGGATTGTGGTTGGATGACACAGTATACTGTCC 

AAGACATGTCATTTGCACAGCAGAAGACATGCTTAATCCTAACTATGAAG 

ATCTGCTCATTCGCAAATCCAACCATAGCTTTCTTGTTCAGGCTGGCAAT 

GTTCAACTTCGTGTTATTGGCCATTCTATGCAAAATTGTCTGCTTAGGCT 

TAAAGTTGATACTTCTAACCCTAAGACACCCAAGTATAAATTTGTCCGTA 

TCCAACCTGGTCAAACATTTTCAGTTCTAGCATGCTACAATGGTTCACC^ 

TCTGGTGTTTATCAGTGTGCCATGAGACCTAATCATACCATTAAAGGTTC 

TTTCCTTAATGGATCATGTGGTAGTGTTGGTTTTAACATTGATTATGATT 

GCGTGTCTTTCTGCTATATGCATCATATGGAGCTTCCAACAGGAGTACAC 

GCTGGTACTGACTTAGAAGGTAAATTCTATGGTCCATTTGTTGACAGACA 

AACTGCACAGGCTGCAGGTACAGACACAACCATAACATTAAATGTTTTGG 

CATGGCTGTATGCTGCTGTTATCAATGGTGATAGGTGGTTTCTTAATAGA 

TTCACCACTACTTTGAATGACTTTAACCTTGTGGCAATGAAGTACAACTA 

TGAACCTTTGACACAAGATCATGTTGACATATTGGGACCTCTTTCTGCTC 

AAACAGGAATTGCCGTCTTAGATATGTGTGCTGCTTTGAAAGAGCTGCTG 

CAGAATGGTATGAATGGTCGTACTATCCTTGGTAGCACTATTTTAGAAGA 

TGAGTTTACACCATTTGATGTTGTTAGACAATGCTCTGGTGTTACCTTCC 

AAGGTAAGTTCAAGAAAATTGTTAAGGGCACTCATCATTGGATGCTTTTA 

ACTTTCTTGACATCACTATTGATTCTTGTTCAAAGTACACAGTGGTCACT 

GTTTTTCTTTGTTTACGAG7ATGCTTTCTTGCCATTTACTCTTGGTATTA 

TGGCAATTGCTGCATGTGCTATGCTGCTTGTTAAGCATAAGCACGCATTC 

TTGTGCTTGTTTCTGTTACCTTCTCTTGCAACAGTTGCTTACTTTAATAT 

GGTCTACATGCCTGCTAGCTGGGTGATGCGTATCATGACATGGCTTGAAT 

TGGCTGACACTAGCTTGTCTGGTTATAGGCTTAAGGATTGTGTTATGTAT 

GCTTCAGCTTTAGTTTTGCTTATTCTCATGACAGCTCGCACTGTTTATGA 

TGATGCTGCTAGACGTGTTTGGACACTGATGAATGTCATTACACTTGTTT 

ACAAAGTCTACTATGGTAATGCTTTAGATCAAGCTATTTCCATGTGGGCC 
TTAGTTATTTCTGTAACCTCTAACTATTCTGGTGTCGTTACGACTATCAT

GTTTTTAGCTAGAGCTATAGTGTTTGTGTGTGTTGAGTATTACCCATTGT 

TATTTATTACTGGCAACACCTTACAGTGTATCATGCTTGTTTATTGTTTC 

TTAGGCTATTGTTGCTGCTGCTAC'TTTGGCCTTTTCTGTTTACTCAACCG 

TTACTTCAGGCTTACTCTTGGTGTTTATGACTACTTGGTCTCTACACAAG 

AATTTAGGTATATGAACTCCCAGGGGCTTTTGCCTCCTAAGAGTAGTATT 

GATGCTTTCAAGCTTAAtATTAAGTTGTTGGGTATTGGAGGTAAACCATG 

TATCAAGGTTGCTACT6TACAGTCTAAAATGTCTGACGTAAAGTGCACAT 

CTGTGGTACTGCTCTCGGTTCTTCAACAACTTAGAGTAGAGTCATCTTCT 

AAATTGTGGGCACAATGTGTACAACTCCACAATGATATTCTTCTTGCAAA 

AGACACAACTGAAGCTTTCGAGAAGATGGTTTCTCTTTTGTCTGTTTTGC 

TATCCATGCAGGGTGCTGTAGACATTAATAGGTTGTGCGAGGAAAOXSCTC 

GATAACCGTGCTACTCTTCAGGCTATTGCTTCAGAATTTAGTTCTTTACC 

ATCATATGCCGCTTATGCCACTGCCCAGGAGGCCTATGAGCAGGCTGTAG 

CTAATGGTGATTCTGAAGTCGTTCTCAAAAAGTTAAAGAAATCTTTGAAT 

GTGGCTAAATCTGAGTTTGACCGTGATGCTGCCATGCAACGCAAGTTGGA 

AAAGATGGCAGATCAGGCTATGACCCAAATGTACAAACAGGCAAGATCTG 

AGGACAAGAGGGCAT^GTAACTAGTGCTATGCAAACAATGCTCTTCACT 

ATGCTTAGGAAGCTTGATAATGATGCACTTAACAACATTATCAACAATGC 

GCGTGATGGTTGTGTTCCACTCAACATCATACCATTGACTACAGCAGCCA . 

AACTCATGGTTGTTGTCCCTGATTATGGTACCTACAAGAACACTTGTGAT 

GGTAACACCTTTACATJiTGCAtCTGCACTCTGGGAAATCCAGCAAGTTGT 

TGATGCGGATAGCAAGATTGTTCAACTTAGTGAAATTAACATGGACAATT 

CACCAAATTTGGCTTGGCCTCTTATTGTTACAGCTCTAAGAGCCAACTCA 

GCTGTTAAACTACAGAATAATGAACTGAGTCCAGTAGCACTACGACAGAT 

GTCCTGTGCGGCTGGTACCACACAAACAGCTTGTACTGATGACAATGCAC 

TTGCCTACTATAACAATTCGAAGGGAGGTAGGTTTGTGCTGGCATTACTA 

TCAGACCACCAAGATCTCAAATGGGCTAGATTCCCTAAGAGTGATGGTAC 

AGGTACAATTTACACAGAACTGGAACCACCTTGTAGGTTTGTTACAGACA 

a^CCAAAAGGGCCTAAAGTGAAATACTTGTACTTCATCAAAGGCTTAAAC 

AACCTAAATAGAGGTATGGTGCTGGGCAGTTTAGCTGCTACAGTACGTCT 

TCAGGCTGGAAATGCTACAGAAGTACCTGCCAATTCAACTGTGCTTTCCT 

TCTGTGCTTTTGCAGTAGACCCTGCTAAAGCATATAAGGATTACCTAGCA 

AGTGGAGGACAACCAATCACCAACTGTGTGAAGATGTTGTGTACACACAC 

TGGTACAGGACAGGCAATTACTGTAACACCAGAAGCTAACATGGACCAAG 



AGTCCTTTGGTGGTGCTTCATGTTGTCTGTATTGTAGATGCCACATTGAC 
CATCCAAATCCTAAAGGATTCTGTGACTTGAAAGGTAAGTACGTCCAAAT 

ACCTACCACTTGTGCTAATGACCCAGTGGGTTTTACACTTAGAAACACAG 
TCTGTACCGTCTGCGGAATGTGGAAAGGTTATGGCTGTAGTTGTGACCAA 
CTCCGCGAACCCTTGATGCAGTCTGCGGATGCATCAACGTTTTTAAACGG 
GTTTGCGGTGTAAGTGCAGCCCGTCTTACACCGTGCGGCACAGGCACTAG 
TACTGATGTCGTCTACAGGGCTTTTGATATTTACAACGAAAAAGTTGCTG 
GTTTTGCAAAGTTCCTAAAAACTAATTGCTGTCGCTTCCAGGAGAAGGAT 
GAGGAAGGCAATTTATTAGACTCTTACTTTGTAGTTAAGAGGCATACTAT 
GTCTAACTACCAACATGAAGAGACTATTTATAACTTGGTTAAAGATTGTC 
CAGCGGTTGCTGTCCATGACTTTTTCAAGTTTAGAGTAGATGGTGACATG 
GTACCACATATATCACGTCAGCGTCTAACTAAATACACAATGGCTGATTT 
AGTCTATGCTCTACGTCAfTTTGATGAGGGTAATTGTGATACATTAAAAG 
AAATACTCGTCAC ATACAATTGCTGTGATGATGATTATTTCAATAAGAAG 
GATTGGTATGACTTCGTAGAGAATCCTGACATCTTACGCGTATATGCTAA 
CTTAGGTGAGCGTGTACGCCa^TCATTATTAAAGACTGTACAATTCTGCXS 
ATGCTATGCGTGATGCAGGCATTGTAGGCGTACTGACATTAGATAATCAG 
GATCTTAATGGGAACTGGTACGATTTCGGTGATTTCGTACAAGTAGCACC 
AGGCTGCGGAGTTCCTATTGTGGATTCATATTACTCATTGCTGATGCCCA 
TCCTCACTTTGACTAGGGCATTGGCTGCTGAGTCCCATATGGATGCTGAT 
CTCGCAAAACCACTTATTAAGTGGGATTTGCTGAT^TATGATTTTACGGA 
AGAGAGACTTTGTCTCTTCGACCGTTATTTTAAATATTGGGACCAGACAT 
ACCATCCCAATTGTATTAACTGTTTGGATGATAGGTGTATCCTTCATTGT

GCAAACTTTAATGTGTTATTTTCTACTGTGTTTCCACCTACAAGTTTTGG 

ACCACTAGTAAGAAAAATATTTGTAGATGGTGTTCCTTTTGTTGTTTCAA 

CTGGATACCATTTTCGTGAGTTAGGAGTCGTACATAATCAGGATGTAAAC 

TTACATAGCTCGCGTCTCAGTTTCAAGGAACTTTTAGTGTATGCTGCTGA 

TCCAGCTATGCATGCAGCTTCTGGCAATTTATTGCTAGATAAACGCACTA 

CATGCTTTTCAGTAGCTGCACTAACAAACAATGTTGCTTTTCAAACT^ 

AAACCCGGTAATTTTAATAAAGACTTTTATGACTTTGCTGTGTCTAAAGG 

TTTCTTTAAGGAAGGAAGTTCTGTTGAACTAAAACACTTCTTCTTTGCTC 

AGGATGGCAACGCTGCTATCAGTGATTATGACTATTATCGTTATAATCTG' 

CCAACAATGTGTGATATCAGACAACTCCTATTCGTAGTTGAAGTTGTTGA 

TAAATACTTTGATTGTTACGATGGTGGCTGTATTAATGCCAACCAAGTAA 

TCGTTAACAATCTGGATAAATCAGCTGGTTTCCCATTTAATATATGGGGT 

AAGGCTAGACTTTATTATGACTCAATGAGTTATGAGGATCAAGATGCACT 

TTTCGCGTATACTAAGCGTAATGTCATCCCTACTATAACTCAAATGAATC 

TTAAGTATGCCATTAGTGCAAAGAATAGAGCTCGCACCGTAGCTGGTGTC 

TCTATCTGTAGTACTATGACAAATAGACAGTTTCATCAGAAATTATTGAA 

GTCAATAGCCGCCACTAGAGGAGCTACTGTGGTAATTGGAACAAGCAAGT 

TTTACGGTGGCTGGCATAATATGTTAAAAACTGTTTACAGTGATGTAGAA 

ACTCCACACCTTATGGGTTGGGATTATCCAAAATGTGACAGAGCCATGCC 

TAACATGCTTAGGATAATGGCCa?CTCTTGTTCTTGCTCGCAAACATAACA 

CTTGCTGTAACTTATCACACCGTTTCTACAGGTTAGCTAACGAGTGTGCG

CAAGTATTAAGTGAGATGGTCATGTGTGGCGGCTCACTATATGTTAAACC 

AGGTGGAACATCATCCGGTGATGCTACAACTGCTTATGCTAATAGTGTCT 

TTAACATTTGTCAAGCTGTTACAGCCAATGTAAATGCACTTCTTTCAACT 

GATGGTAATAAGATAGCTGACAAGTATGTCCGCAATCTACAACACAGGCT 

CTATGAGTGTCTCTATAGAAATAGGGATGTTGATCATGAATTCGTGGATG 

AGTTTTACGCTTACCTGCGTAAACATTTCTCCATGATGATTCTTTCTGAT 

GATGCCGTTGTGTGCTATAACAGTAACTATGCGGCTCAAGGTTTAGTAGC 

TAGCATTTAGAACTTTAAGGCAGTTCTTTATTATCAAAATAATGTGTTCA 

TGTCTGAGGCAAAATGTTGGACTGAGACTGACCTTACTAAAGGACCTCAC 

GAATTTTGCTCACAGCATACAATGCTAGTTAAACAAGGAGATGATTACGT 

GTACCTGCCTTACCCAGATCCATCAAGAATATTAGGCGCAGGCTGTTTTG 

TCGATGATATTGTCAAAACAGATGGTACACTTATGATTGAAAGGTTCGTG 

TCACTGGCTATTGATGCTTACCCACTTACAAAACATCCTAATCAGGAGTA 

TGCTGATGTCTTTCACTTGTATTTACAATACATTAGAAAGTTACATGATG

AGCTTACTGGCCACATGTTGGACATGTATTCCGTAATGCTAACTAATGAT 

AACACCTCACGGTACTGGGAACCTGAGTTTTATGAGGCTATGTACACACC 

ACATACAGTCTTGCAGGCTGTAGGTGCTTGTGTATTGTGCAATTCACAGA 

CTTCACTTCGTTGCGGTGCCTGTATTAGGAGACCATTCCTATGTTGCAAG 

TGCTGCTATGACCATGTCATTTCAACATCACACAAATTAGTGTTGTCTGT 

TAATCCCTATGTTTGCAATGCCCCAGGTTGTGATGTCACTGATGTGACAC 

AACTGTATCTAGGAGGTATGAGCTATTATTGCAAGTCACATAAGCCTCCC 

ATTAGTTTTCCATTATGTGCTAATGGTCAGGTTTTTGGTTTATACAAAAA 

CACATGTGTAGGCAGTGACAATGTCACTGACTTCAATGCGATAGCAACAT 

GTGATTGGACTAATGCTGGCGATTACATACTTGCCAACACTTGTACTGAG 

AGACTCAAGCTTTTCGCAGCAGAAACGCTCAAAGCCACTGAGGAAACATT 

TAAGCTGTCATATGGTATTGCCACTGTACGCGAAGTACTCTCTGACAGAG 

AATTGCATCTTTCATGGGAGGTTGGAAAACCTAGACCACCATTGAACAGA 

AACTATGTCTTTACTGGTTACCGTGTAACTAAAAATAGTAAAGTACAGAT 

TGGAGAGTACACCTTTGAAAAAGGTGACTATGGTGATGCTGTTGTGTACA 

GAGGTACTACGACATACAAGTTGAATGTTGGTGATTACTTTGTGTTGACA 

TCTCACACTGTAATGCCACTTAGTGCACCTACTCTAGTGCCACAAGAGCA 

CTATGTGAGAATTACTGGCTTGTACCCAACACTCAACATCTCAGAOXSAGT 

TTTCTAGCAATGTTGCAAATTATCAAAAGGTCGGCATGCAAAAGTACTCT 

ACACTCCAAGGACCACCTGGTACTGGTAAGAGTCATTTTGCCATCGGACT 

TGCTCTCTATTACCCATCTGCTCGCATAGTGTATACGGCATGCTCTCATG
CAGCTGTTGATGCCCTATGTGAAAAGGGATTAAAATATTTGCCCATAGAT
AAATGTAGTAGAATCATACCTGCGCGTGCGCGCGTAGAGTGTTTTGATAA 

ATTCAAAGTGAATTCAACACTAGAACAGTATGTTTTCTGCACTGTAAATG 
CATTGCCAGAAACAACTGCTGACATTGTAGTCTTTGATGAAATCTCTATG 
GCTACTAATTATGACTTGAGTGTTGTCAATGCTAGACTTCGTGCAAAACA 

CTACGTCTATATTGGCGATCCTGCTCAATTACCAGCCCCCCGCACATTGC
TGACTAAAGGCACACTAGAACCAGAATATTTTAATTCAGTGTGCAGACTT

ATGAAAACAATAGGTCCAGACATGTTCCTTGGAACTTGTCGCCGTTGTCC 

TGCTGAAATTGTTGACACTGTGAGTGCTTTAGTTTATGACAATAAGCTAA

AAGCACACAAGGATAAGTCAGCTCAATGCTTCAAAATGTTCTACAAAGGT 

GTTATTACACATGATGTTTCATCTGCAATCAACAGACCTCAAATAGGCGT

TGTAAGAGAATTTCTTACACGCAATCCTGCTTGGAGAAAAGCTGTTTTTA 

TCTCACCTTATAATTCACAGAACGCTGTAGCTTCAAAAATCTTAGGATTG 

CCTACGCAGACTGTTGATTCATCACAGGGTTCTGAATATGACTATGTCAT

ATTCACT^CAAACTACTGAAACAGCACACTCTTGTAATGTCAACCGCTTCA 

ATGTGGCTATCACAAGGGCAAAAATTGGCATTTTGTGCATAATGTCTGAT 

AGAGATCTTTATGACAAACTGCAATTTACAAGTCTAGAAATACCACGTCG 

CAATGTGGCTACATTACAAGCAGAAAATGTAACTGGACTTTTTAAGGACT 

GTAGTAAGATCATTACTGGTCTTCATCCTACACAGGCACCTACACACCTC 

AGCGTTGATATAAAGTTCAAGACTGAAGGATTATGTGTTGACATACCAGG

CATACCAAAGGACATGACCTACCGTAGACTCATCTCTATGATGGGTTTCA 

AAATGAATTACCAAGTdAATGGTTACCCTTUiTATGTTTATCACCCGCGAA 

GAAGCTATTCGTCACGTTCGTGCGTGGATTGGCTTTGATGTAGAGGGCTG 

TCATGCAACTAGAGATGCTGTGGGTACTAACCTACCTCTCCAGCTAGGAT 

TTTCTACAGGTGTTAACTTAGTAGCTGTACCGACTGGTTATGTTGACACT 

GAAAATAACACAGAATTCACCAGAGTTAATGCAAAACCTCCACCAGGTGA 

CCAGTTTAAACATCTTATACCACTCATGTATAAAGGCTTGCCCTGGAATG 

TAGTGCGTATTT^GATAGTACAAATGCTCAGTGATACACTGAAAGGATTG 

TCAGACAGAGTCGTGTTCGTCCTTTGGGCGCATGGCTTTGAGCTTACATC 

AATGAAGTACTTTGTCAAGATTGGACCTGAAAGAACGTGTTGTCTGTGTG 

ACAAACGTGCAACTTGCTTTTCTACTTCATCAGATACTTATGCCTGCTGG 

AATCATTCTGTGGGTTTTGACTATGTCTATAACCCATTTATGATTGATGT 

TCAGCAGTGGGGCTTTACGGGTAACCTTCAGAGTAACCATGACCAACATT 

GCCAGGTACATGGAAATGCACATGTGGCTAGTTGTGATGCTATCATGACT 

AGATGTTTAGCAGTCCATGAGTGCTTTGTTAAGCGCGTTGATTGGTCTGT 

TGAATACCCTATTATAGGAGATGAACTGAGGGTTAATTCTGCTTGCAGAA 

AAGTACAACACATGGTTGTGAAGTCTGCATTGCTTGCTGATAAGTTTCCA 

GTTCTTCATGACATTGGAAATCCAAAGGCTATCAAGTGTGTGCCTCAGGC 

TGAAGTAGAATGGAAGTTCTACGATGCTCAGCCATGTAGTGACAAAGCTT 

ACAAAATAGAGGAACTCTTCTATTCTTATGCTACACATCACGATAAATTC 

ACTGATGGTGTTTGTTTGTTTTGGAATTGTAACGTTGATCGTTACCCAGC 

CAATGCAATTGTGTGTAGGTTTGACACAAGAGTCTTGTCAAACTTGAACT 

TACCAGGCTGTGATGGTGGTAGTTTGTATGTGAATAAGCATGCATTCCAC 

ACTCCAGCTTTCGATAAAAGTGCATTTACTAATTTAAAGCAATTGCCTTT 

CTTTTACTATTCTGATAGTCCTTGTGAGTCTCATGGCAAACAAGTAGTGT 

CGGATATTGATTATGTTCCACTCAAATCTGCTACGTGTATTACACGATGC 

AATTTAGGTGGTGCTGTTTGCAGACACCATGCAAATGAGTACCGACAGTA 

CTTGGATGCATATAATATCATGATTTCTGCTGGATTTAGCCTATGGATTT 

ACAAACAATTTGATACTTATAACCTGTGGAATACATTTACCAGGTTACAG 

AGTTTAGAAAATGTGGCTTATAATGTTGTTAATAAAGGACACTTTGATGG 

ACACGCCGGCGAAGCACCTGTTTCCATCAOTAATAATGCTGTOTACACMi 

AGGTAGATGGTATTGATGTGGAGATCTTTGAAAATAAGACAACACTTCCT

GTTAATGTTGCATTTGAGCTTTGGGCTAAGCGTAACATTAAACCAGTGCC 

AGAGATTAAGATACTCAATAATTTGGGTGTTGATATCGCTGCTAATACTG

TAATCTGGGACTACAAAAGAGAAGCCCCAGCACATGTATCTACAATAGGT 
GTCTGCAO^TGACTGACATTGCCAAGAAACCTACTGAGAGTGCTTGTTC 
TTCACTTACTGTCTTGTTTGATGGTAGAGTGGAAGGACAGGTAGACCTTT
TTAGAAACGCCCGTAATGGTGTTTTAATAACAGAAGGTTCAGTCAAAGGT

CTAACACCTTCAAAGGGACCAGCACAAGCTAGCGTCAATGGAGTCACATT 

AATTGGAGAATCAGTAAAAACACAGTTTAACTACTTTAAGAAAGTAGACG 

GCATTATTCAACAGTTGCCTGAAACCTACTTTACTCAGAGCAGAGACTTA 

GAGGATTTTAAGCCCAGATCACAAATGGAAACTGACTTTCTCGAGCTCGC 

TATGGATGAATTCATACAGCGATATAAGCTCGAGGGCTATGCCTTCGAAC 

ACATCGTTTATGGAGATTTCAGTCATGGACAACTOKSGCGGTCTTCATTTA 

ATGATAGGCTTAGCCAAGCGCTCACAAGATTCACCACTTAAATTAGAGGA

TTTTATCCCTATGGACAGCACAGTGAAAAATTACTTCATAACAGATGCGC

AAACAGGTTCATCAAAATGTGTGTGTTCTGTGATTGATCTTTTACTTGAT 

GACTTTGTCGAGATAATAAAGTCACAAGATTTGTCAGTGATTa?CAAAAGT 

GGTCAAGGTTACAATTGACTATGCTGAAATTTCATTCATGCTTTGGTGTA 

AGGATGGACATGTTGAAACCTTCTACCCAAAACTACAAGCAAGTCAAGCG 

TGGCAACCAGGTGTTGCGATGCCTAACTTGTACAAGATGCAAAGAATGCT 

TCTTGAAAAGTGTGACCTTCAGAATTATGGTGAAAATGCTGTTATACCAA 

AAGGAATAATGATGAATGTCGCAAAGTATACTCAACTGTGTCAATACTTA 

AATACACTTACTTTAGCTGTACCCTACAACATGAGAGTTATTCACTTTGG 

TGCTGGCTCTGATAAAGGAGTTGCACCAGGTACAGCTGTGCTCAGACAAT 

GGTTGCCAACTGGCACACTACTTGTCGATTCAGATCTTAATGACTTCGTC 

TCCGACGCAGATTCTACTTTAATTGGAGACTGTGCAACAGTACATACGGC 

TAATAAATGGGACCTTATTATTAGCGATATCTATGACCCTAGGACCAAAC 

ATGTGACAAAAGAGAATGACTCTAT^GAAGGGTTTTTCACTTATCTGTGT 

GGATTTATAAAGCAAAAACTAGCCCTGGGTGGTTCTATAGCTGTAAAGAT

AACAGAGCATTCTTGGAATGCTGACCTTTACAAGCTTATGGGCCATTTCT 

CATGGTGGACAGCTTTTGTTACAAATGTAAATGCATCATCATCGGAAGCA 

TTTTTAATTGGGGCTAACTATCTTGGCAAGCCGAAGGAACAAATTGATGG 

CTATACCATGCATGCTAACTACATTTTCTGGAGGAACACAAATCCTATCC 

AGTTGTCTTCCTATTCACTCTTTGACATGAGCAAATTTCCTCTTAAATTA 

AGAGGAACTGCTGTAATGTCTCTTAAGGAGAATCAAATCAATGATATGAT 

TTATTCTCTTCTGGAAAAAGGTAGGCTTATCATTAGAGAAAACAACAGAG 

TTGTGGTTTCAAGTGATATTCTTGTTAACAACTAAACGAACATGTTTATT 

TTCTTATTATTTCTTACTCTCACTAGTGGTAGTGACCTTGACCGGTGCAC 

CACTTTTGATGATGTTCAAGCTCCTAATTACACTCAACATACTTCATCTA 

TGAGGGGGGXTTACTATCCTGATGAAATTTTTAGATCAGACACTCTTTAT 

TTAACTCAGGATTTATTTCTTCCATTTTATTCTAATGTTACAGGGTTTCA

TACTATTAATCATACGTTTGGCAACCCTGTCATACCTTTTAAGGATGGTA 

TTTATTTTGCTGCCACAGAGAAATCAAATGTTGTCCGTGGTTGGGTTTTT 

GGTTCTACCATGAACAACAAGTCACAGTCGGTGATTATTATTAACAATTC

TACTAATGTTGTTATACGAGCATGTAACTTTGAATTGTGTGACAACCCTT 

TCTTTGCTGTTTCTAAACCCATGGGTACACAGACACATACTATGATATTC 

GATAATGCATTTAATTGCACTTTCGAGTACATATCTGATGCCTTTTCGCT 

TGATGTTTCAGAAAAGTCAGGTAATTTTAAACACTTACGAGAGTTTGTGT 

TTAAAT^TAAAGATGGGTTTCTCTATGTTTATAAGGGCTATCAACCTATA 

GATGTAGTTCGTGATCTACCTTCTGGTTTTAACACTTTGAAACCTATTTT 

TAAGTTGCCTCTTGGTATTAACATTACAAATTTTAGAGCCATTCTTACAG 

CCTTTTCACCTGCTCAAGACATTTGGGGCACGTCAGCTGCAGCCTATTTT 

GTTGGCTATTTAAAGCCAACTACATTTATGCTCAAGTATGATGAAAATGG 

TACAATCACAGATGCTGTTGATTGTTCTCAAAATCCACTTGCTGAACTCA 

AATGCTCTGTTAAGAGCTTTGAGATTGACAAAGGAATTTACCAGACCTCT 

AATTTCAGGGTTGTTCCCTCAGGAGATCTTGTGAGATTCCCTAATA 

AAACTTGTGTCCTTTTGGAGAGGTTTTTAATGCTACTAAATTCCCTTCTG 

TCTATGCATGGGAGAGAAAAAAAATTTCTAATTGTGTTGCTGATTACTCT 

GTGCTCTACAACTCAACATTTTTTTCAACCTTTAAGTGCTATGGCGTTTC 

TGCCACTAAGTTGAATGATCTTTGCTTCTCCAATGTCTATGCAGATTCTT 

TTGTAGTCAAGGGAGATGATGTAAGACAAATAGCGCCAGGACAAACTGGT 

GTTATTGCTGATTATAATTATAAATTGCCAGATGATTTCATGGGTTGTGT 

CCTTGCTTGGAATACTAGGAACATTGATGCTACTTCAACTGGTAATTATA
ATTATAAATATAGGTATCTTAGACATGGCAAGCTTAGGCCCTTTGAGAGA

GACATATCTAATGTGCCTTTCTCCCCTGATGGCAAACCTTGCACCCCACC 

TGCTCTTAATTGTTATTGGCCATTAAATGATTATGGTTTTTACACCACTA 

CTGGCATTGGCTACCAACCTTACAGAGTTGTAGTACTTTCTTTTGAACTT 

TTAAATGCACCGGCCACGGTTTGTGGACCAAAATTATCCACTGACCTTAT 

TAAGAACCAGTGTGTCAATTTTAATTTTAATGGACTCACTGGTACTGGTG 

TGTTAACTCCTTCTTCAi^GAGATTTCAACCATTTCAAC^ 

GATGTTTCTGATTTCACTGATTCCGTTCGAGATCCTAAAACATCTGAAAT 

ATTAGACATTTCACCTTGCGCTTTTGGGGGTGTAAGTGTAATTACACCTG 

GAACAAATGCTTCATCTGAAGTTGCTGTTCTATATCAAGATGTTAACTGC 

ACTGATGTTTCTACAGCAATTCATGCAGATCAACTCACACCAGCTTGGCG 

CATATATTCTACTGGAAACAATGTATTCCAGACTCAAGCAGGCTGTCTTA 

TAGGAGCTGAGCATGTCGACACTTCTTATGAGTGCGACATTCCTATTGGA 

GCTGGCATTTGTGCTAGTTACCATACAGTTTCTTTATTACGTAGTACTAG

CCAAAT^TCTATTGOX^CTTATACTATGTCTTTAGGTGCTGATAGTTCAA 

TTGCTTACTCTAATAACACCATTGCTATACCTACTAACTTTTCAATTAGC 

ATTACTACAGAAGTAATGCCTGTTTCTATGGCTAAAACCTCCGTAGATTG 

TAATATGTACATCTGCGGAGATTCTACTGAATGTGCTAATTTGCTTCTCC 

AATATGGTAGCTTTTGCACACAACTAAATCGTGCACTCTCAGGTATTGCT 

GCTGAACAGGATCGCAACACACGTGAAGTGTTCGCTCAAGTCAAACAAAT

GTACAAAACCCCAACTTTGAAATATTTTGGTGGTTTTAATTTTTCACAAA 

TATTACCTGACCCTCTi^AAGCCTACTAAGAGGTCTTTTATTGAGGACTTG 

CTCTTTAATAAGGTGACACTCGCTGATGCTGGCTTCATGAAGCAATATGG 

CGAATGCCTAGGTGATATTAATGCTAGAGATCTCATTTGTGCGCAGAAGT 

TCAATGGACTTACAGTGTTGCCACCTCTGCTCACTGATGATATGATTGCT 

GCCTACACTGCTGCTCTAGTTAGTGGTACTGCCACTGCTGGATGGACATT 

TGGTGCTGGCGCTGCTCTTCAAATACCTTTTGCTATGCAAATGGCATATA 

GGTTCAATGGCATTGGAGTTACCCAAAATGTTCTCTATGAGAACCAAAAA 

CAAATCGCCAACCAATTTAACAAGGCGATTAGTCAAATTCT^GAATCACT 

TACAACAACATCAACTGCATTGGGCAAGCTGCAAGACGTTGTTAACCAQA 

ATGCTCAAGCATTAAACACACTTGTTAAACAACTTAGCTCTAATTTTGGT 

GCAATTTCAAGTGTGCTAAATGATATCCTTTCGCGACTTGATAAAGTCQA 

GGCGGAGGTACAAATTGACAGGTTAATTACAGGCAGACTTCAAAGCCTTC 

AAACCTATGTAACACAACAACTAATCAGGGCTGCTGAAATCAGGGCTTCT 

GCTAATCTTGCTGCTACTAAAATGTCTGAGTGTGTTCTTGGACAATCAAA 

AAGAGTTGACTTTTGTGGAAAGGGCTACCACCTTATGTCCTTCCCACAAG
CAGCCCCGCATGGTGTTGTCTTCCTACATGTCACGTATGTGCCATCCCAG
GAGAGGAACTTCACCACAGCGCCAGCAATTTGTCATGAAGGCAAAGCATA

CTTCCCTCGTGAAGGTGTTTTTGTGTTTAATGGCACTTCTTGGTTTATTA 
CACAGAGGAACTTCTTTTCTCCACAAATAATTACTACAGACAATACATTT 
GTCTCAGGAAATTGTGATGTCGTTATTGGCATCATTAACAACACAGTTTA 
TGATCCTCTGCAACCTGAGCTTGACTCATTCAAAGAAGAGCTGGACAAGT 
ACTTCAAAAATCATACATCACCAGATGTTGATCTTGGCGACATTTCAGGC 
ATTAACGCTTCTGTCGTCAACATTCAAAAAGAAATTGACCGCCTCAATGA 
GGTCGCTAAAAATTTAAATGAATCACTCATTGACCTTC7UVGAATTGGGAA 
AATATGAGCAATATATTAAATGGCCTTGGTATGTTTGGCTCGGCTTCATT 
GCTGGACTAATTGCCATCGTCATGGTTACAATCTTGCTTTGTTGCATGAC 
TAGTTGTTGCAGTTGCCTCAAGGGTGCATGCTCTTGTGGTTCTTGCTGCA 
AGTTTGATGAGGATGACTCTGAGCCAGTTCTCAAGGGTGTCAAATTACAT 
TACACATAAACGAACTTATGGATTTGTTTATGAGATTTTTTACTCTTAGA 
TCAATTACTGCACAGCCAGTAAAAATTGACAATGCTTCTCCTGCAAGTAC 
TGTTCATGCTACAGCAACGATACCGCTACAAGCCTCACTCCCTTTCGGAT 
GGCTTGTTATTGGCGTTGCATTTCTTGCTGTTTTTCAGAGCGCTACCAAA 
ATAATTGCGCTCAATAAAAGATGGCAGCTAGCCCTTTATAAGGGCTTCCA 
GTTCATTTGCAATTTACTGCTGCTATTTGTTACCATCTATTCACATCTTT 
TGCTTGTCGCTGCAGGTATGGAGGCGCAATTTTTGTACCTCTATGCCTTG 
ATATATTTTCTACAATGCATCAACGCATGTAGAATTATTATGAGATGTTG
GCTTTGTTGGAAGTGCAAATCCAAGAACCCATTACTTTATGATGCCAACT
ACTTTGTTTGCTGGCACACACATAACTATGACTACTGTATACCATATAAC

AGTGTCACAGATACAATTGTCGTTACTGAAGGTGACGGCATTTCAACACC 

AAAACTCAAAGAAGACTACCAAATTGGTGGTTATTCTGAGGATAGGCACT 

CAGGTGTTAAAGACTATGTCGTTGTACATGGCTATTTCACCGAAGTTTAC 

TACCAGCTTGAGTCTACACAAATTACTACAGACACTGGTATTGAAAATGC 

TACATTCTTCATCTTTAACAAGCTTGTTAAAGACCCACCGAATGTGCAAA 

TACACACAATCGACGGCTCTTCAGGAGTOXSCTAATCCAGCAATGGATCCA 

ATTTATGATGAGCCGACGACGACTACTAGCGTGCCTTTGTAAGCACAAGA 

AAGTGAGTACGAACTTATGTACTCATTCGTTTCGGAAGAAACAGGTACGT 

TAATAGTTAATAGCGTACTTCTTTTTCTTGCTTTCGTGGTATTCTTGCTA 

GTCACACTAGCCATCCTTACTGCGCTTCGATTGTGTGCGTACTGCTGCAA 

TATTGTTAACGTGAGTTTAGTAAAACCAACGGTTTACGTCTACTCGCGTG 

TTAAAAATCTGAACTCTTCTGAAGGAGTTCCTGATCTTCTGGTCTAAACG 

AACTAACTATTATTATTATTCTGTTTGGAACTTTAACATTGCTTATCATG 

GCAGACAACGGTACTATTACCGTTGAGGAGCTTAAACAACTCCTGGAACA 

ATGGAACCTAGTAATAGGTTTCCTATTCCTAGCCTGGATTATGTTACTAC 

AATTTGCCTATTCTAATCGGAACAGGTTTTTGTACATAATAAAGCTTGTT 

TTCCTCTGGCTCTTGTGGCCAGTAACACTTGCTTGTTTTGTGCTTGCTGC 

TGTCTACAGAATTAATTGGGTGACTGGCGGGATTGCGATTGCAATGGCTT 

GTATTGTAGGCTTGATGTGGCTTAGCTACTTCGTTGCTTCCTTCAGGCTG 

TTTGCTCGTACCCGCTCAATGTGGTCATTCAACXCAGAAACAAACATTCT 

TCTCAATGTGCCTCTCCGGGGGACAATTGTGACCAGACCGCTCATGGAAA 

GTGAACTTGTCATTGGTGCTGTGATCATTCGTGGTCACTTGCGAATGGCC 

GGACACTCCCTAGGGCGCTGTGACATTAAGGACCTGCCAAAAGAGATCAC 

TGTGGCTACATCACGAACGCTTTCTTATTACAAATTAGGAGCGTCGCAGC 

GTGTAGGCACTGATTCAGGTTTTGCTGCATACAACCGCTACCGTATTGGA 

AACTATAAATTAAATACAGACCACGCCGGTAGCAACGACAATATTGCTTT 

GCTAGTACAGTAAGTGACAACAGATGTTTCATCTTGTTGACTTCCAGGTT 

ACAATAGCAGAGATATTGATTATCATTATGAGGACTTTCAGGATTGCTAT 

TTGGAATCTTGACGTTATAATAAGTTCAATAGTGAGACAATTATTTAAGC 

CTCTAACTAAGAAGAATTATTCGGAGTTAGATGATGAAGAACCTATGGAG 

TTAGATTATCCATAAAACGAACATGAAAATTATTCTCTTCCTGACATTGA 

TTGTATTTACATCTTGCGAGCTATATCACTATCAGGAGTGTGTTAGAGGT 

ACGACTGTACTACTAAAAGAACCTTGCCCATCAGGAACATACGAGGGCAA 

TTC^ICCATTTCACCCTCTTGCTGACAATAAATTTGCACTAACTTGCACTA 

GCACACACTTTGCTTTTGCTTGTGCTGACGGTACTCGACATACCTATCAG 

CTGCGTGCAAGATCAGTTTCACCAAAACTTTTCATCAGACAAGAGGAGGT 

TCAACAAGAGCTCTACTCGCCACTTTTTCTCATTGTTGCTGCTCTAGTAT 

TTTTAATACTTTGCTTCACCATTAAGAGAAAGACAGAATGAATGAGCTCA 

CTTTAATTGACTTCTATTTGTGCTTTTTAGCCTTTCTGCTATTCCTTGTT 

TTAATAATGCTTATTATATTTTGGTTTTCACTCGAAATCCAGGATCTAGA 

AGAACCTTGTACCAAAGTCTAAACGAACATGAAACTTCTCATTGTTTTGA 

CTTGTATTTCTCTATGCAGTTGCATATGCACTGTAGTACAGCGCTGTGCA 

TCTAATAAACCTCATGTGCTTGAAGATCCTTGTAAGGTACAACACTAGGG 

GTAATACTTATAGCACTGCTTGGCTTTGTGCTCTAGGAAAGGTTTTACCT 

TTTCATAGATGGCACACTATGGTTCAAACATGCACACCTAATGTTACTAT 

CAACTGTCAAGATCCAGCTGGTGGTGCGCTTATAGCTAGGTGTTGGTACC 

TTCATGAAGGTCACCAAACTGCTGCATTTAGAGACGTACTTGTTGTTTTA 

AATAAACGAACAAATTAAAATGTCTGATAATGGACCCCAATCAAACCAAC 

GTAGTGCCCCCCGCATTACATTTGGTGGACCCACAGATTCAACTGACAAT 

AACCAGAATGGAGGACGCAATGGGGCAAGGCCAAAACAGCGCCX3ACCCCA 

AGGTTTACCCAATAATACTGCGTCTTGGTTCACAGCTCTCACTCAGCATG 

GCAAGGAGGAACTTAGATTCCCTCGAGGCCAGGGCGTTCCAATCAACACC 

AATAGTGGTCCAGATGACCAAATTGGCTACTACCGAAGAGCTACCCGACG 

AGTTCGTGGTGGTGACGGCAAAATGAAAGAGCTCAGCCCCAGATGGTACT 

TCTATTACCTAGGAACTGGCCCAGT^GCTTCACTTCCCTACGGCGCTAAC
AT^GAAGGCATCGTATGGGTTGCAACOJGAGGGAGCCTTGAATACACCCAA
AGACCACATTGGCACCCGCAATCCTAATAACAATGCTGCCACCGTGCTAC 
AACTTCCTCAAGGAACAACATTGCCAAAAGGCTTCTACGCAGAGGGAAGC 
AGAGGCGGCAGTCAAGCCTCTTCTCGCTCCTCATCACGTAGTCGCGGTAA 
TTCAAGAAATTCAACTCCTGGCAGCAGTAGGGGAAATTCTCCTGCTCGAA 
TGGCTAGCGGAGGTGGTGAAACTGCCCTCGCGCTATTGCTGCTAGACAGA 
TTGAACCAGCTTGAGAGCAAAGTTTCTGGTAAAGGCCAACAACAACAAGG 
CCAAACTGTCACTAAGAAATCTGCTGCTGAGGCATCTAAAAAGCCTCGCC

AAAAACGTACTGCCACAAAACAGTACAACGTCACTCAAGCATTTGGGAGA

CGTGGTCCAGAACAAACCCAAGGAAATTTCGGGGACCAAGACCTAATCAG

ACAAGGAACTGATTACAAACATTGGCCGCAAATTGCACAATTTGCTCCAA

GTGCCTCTGCATTCTTTGGAATGTCACGCATTGGCATGGAAGTCACACCT

TCGGGAACATGGCTGACTTATCATGGAGCCATTAAATTGGATGACAAAGA

TCCACAATTCAAAGACAACGTCATACTGCTGAACAAGCACATTGACGCAT

ACAAAACATTCCCACCAACAGAGCCTAAAAAGGACAAAAAGAAAAAGACT

GATGAAGCTCAGCCTTTGCCGCAGAGACAAAAGAAGCAGCCCACTGTGAC

TCTTCTTCCTGCGGCTGACATGGATGATTTCTCCAGACAACTTCAAAATT

CCATGAGTGGAGCTTCTGCTGATTCAACTCAGGCATAAACACTCATGATG

ACCACACAAGGCAGATGGGCTATGTAAACGTTTTCGCAATTCCGTTTACG

ATACATAGTCTACTCTTGTGCAGAATGAATTCTCGTAACTAAACAGCACA

AGTAGGTTTAGTTAACTTTAATCTCACATAGCAATCTTTAATCAATGTGT

AACATTAGGGAGGACTTGAAAGAGCCACCACATTTTCATCGAGGCCACGC

GGAGTACGATCGAGGGTACAGTGAATAATGCTAGGGAGAGCTGCCTATAT

GGAAGAGCCCTAATGTGTAAAATTAATTTTAGTAGTGCTATCCCCATGTG

ATTTTAATAGCTTCTTAGGAGAATGACAAAAAAAAAAAAAAAAAAAAAAA A

GenBank Accession No. AY274119.3, SEQ ID NO: 15
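A listing of this kind can be spot-checked by translating a stretch of the nucleotide sequence and comparing the result with the corresponding amino acid sequence. The following is a minimal sketch, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative and are not part of the original disclosure. Codons that contain characters other than A, C, G and T (for example, scanning debris) are reported as `X` rather than guessed.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(nucleotides: str, frame: int = 0) -> str:
    """Translate a nucleotide string in the given reading frame (0, 1 or 2).

    Whitespace is ignored, U is treated as T, stop codons are written as '*',
    and any codon with an unrecognised character is written as 'X'.
    """
    seq = "".join(nucleotides.split()).upper().replace("U", "T")
    protein = []
    for pos in range(frame, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[pos:pos + 3], "X"))
    return "".join(protein)


# A 57-nt fragment copied from the listing above, starting at an ATG codon.
fragment = "ATGTTTATT" + "TTCTTATTATTTCTTACTCTCACTAGTGGTAGTGACCTTGACCGGTGC"
print(translate(fragment))  # MFIFLLFLTLTSGSDLDRC
```

Reporting unreadable codons as `X` keeps the check conservative: a damaged position shows up in the output instead of being silently resolved.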
FAVIGDliKCT TVSINDVDTGAPSISTDIVDVTNGLQ
229E

PEDV 16 ISQEPFDPSGYQLYLHKATNGNTNATARXiR--ICQFPI»IKTL6PTVNZ3VTT6-- 

CCOV NARGKPLLVHVHGDPVSIIlYISAYRDDVQPRPLIiKHGliIiCITKNKIlDyOT'FTSAQWS- 

PRC 

FICV WDDRDKVGLLIAIHGNSKYSLLMVLQDAVEANQPHVAVKICHWKPGNISSYHAFSVNLGD 

BoCov TY y\n:,DRVYLhriTLLLNGyYPTSGSTYRNMALKGTLLLSRLWFKPPFLSDFING- 

OC43 TY YVLDRVYLNITLLLNGYYPTSGSTYRNMALKGTLLLSRLWFKPPFLSDFING- 

PHEV TF YVLDRVYLNTTLLLNGYYPISGATFRNMALKGTRLLSTLWFKPPFLSPFNDQ- 

MHV TY YVLDRVYLNATLLLTGYYPVDGSNYRNLALTGTNTLSLTWFKPPFLSEFNDG- 

TOR2_S HT SSMRGVYYPDEI FRSDTLYLTQDLFLPFYSNVTGFHTINHTFGNPVIPFKDG-

AIBV 



229B 

PEDV -RNCIiFNKAiP AYMRDGKDIWGITWDNDRVT-VFADKIYHFYLKNDWSR 

CCov -AICLGDDRKIPFSVIFTDNGTKIFGLEWm>DyVTAYISDRSHHUTIHMNWFNI^ 

PRC r ■ 

FICV GGQCVFNQRFS LiyiVLTTNDFyGFQWTDTYVDiyLGGTITKyWVI3NIWSIV^ 

BoCov IFAKVKN TKVIKKGVMYSEFPAITIGSTFVNTSYSVWQPHTTN 

OC43 IFAKVKN TKVIKHGVMYSEPPAITIGSTFVNTSYSVWQPHTTN 

PHEV IFAKVKN SRFSKDGVIYSEFPAITIGSTFVNTSYSIWEPHTSL 

MHV IFAKVQN LKTNTPTGATSYFPTIVIGSLFGNTSYTWLEPYNN 

TOR2_S IYFAATEK SNVVRGWVFGSTMNNKSQSVIIINNSTNVVIRACNFELCDN

AIBV MLGKSLFLVTILCALCSANLFDPANYVYYYQSAFRP 



229B MPVLLVAYALLHIAGCQTTNGIiN— TSYSVCNG CVGYSENVFAVES 

PEDV - - VATRCYNRRSCAMQYVYTPTYYMLNVTSAGEDG- lYYEPCTAN — CTGYAANVFATDS 

CCov RSSSATWQKSAAYVYQGVSNFTYYKLNNTNGLKS YELCEDYEYCTGYATNVFAPTV 

PRC -MKKLFWLWMPLIYGDKFPTSWSN CTD— QCASYVANVFTTQP 

FICV - 1 SYHWNRINYGYYMQFVNRTTYYAYNNTGGANYTQLQLSECHTD- YCAGYAKNVFVP- 1 

BoCov -LDNKLQGLLEISVCQYTMCEYPHTICHPKL-GNKRVELWHWDTGWSCLYKRNFTYDVN 

0C4 3 -LDNKIJQGLLEISVCQYTMCEYPNTICHPNL-GNRRVELWHWDTGVVSCLyKRNFTYDVN 

PHEV -INGNLQGIiLQISVCQYTMCEYPHTICHPNL-GNQRIELWHYDTDWSCLYRRNFTYDVN 

MHV IIMASVCTYTICQLPYTPCKPNTNGNRVIGFWHTDVKPPICLLKRNFTFNVM 

TOR2_S ---PFFAVSKPMGTQTHTMIFDNAFNCTFEYISDAFSLDVSEKSGNFKHLREFVFKNKDG

AIBV SNGWHLQGGAYAWNSSNYANNAGSASBCT VGVIXDVYNQSAASIAHTAPLQQ
FLYVYKGYQPIDVVRDLPSGFNTLKPIFKLPLGINITNFRAILTAFSPAQDIWGTSAAAY
imWSKSQFCSAHCDFSEXTVFWHCYSSGS6SCPITGHIABGHIRlSAMKN6SIiFVNI«lV
GDCKGFSSDVLSDVIRYNLN-FEENLRRGT ILFKTSYGV-WFYCTNNT-

GVCNGAAVDRAPEALRFNINDTSVILAEGS IVLHTAIiGTNLSPVCSNSSD 

SQCNGVSLNNTVDVIRFNLN- FTALVQSGMGATV-FSLimHSGVILEI SCYKDTVS E 

DQCNGAVLl^TVDVlRFmiN-FTTNVQSGKGATV-FSLNTTGGVTLEISCYND^ D 

AFCP NVTADVLRFNLNFSDTDVra)STNDEQIiPFTPEDOTrPASIACYSSAN^^ 

WVTPLTSKQYIiIiAFNQDGVIFNAVDCKSDFMS- - -EIKCKTLSIAPSTGVYEIiNQ 

WVTPLTSKQYLLAFNQDGVIFNAVlX:KSDFMS---EIKCRrLSIAPSTGVYELMG 

WVTPI.TTRQPr.IiAFDQDGVIiYHAVDCASDFMS— EIMCRTSSITPPTGVYEIiNG 

WVTPLLKRQYLFNFNEKGVITSAVDCASSYIS EIKCKTQSLLPSTG^7YDLSG — 

FVGYIiKPTTFMLKYDENGTITDAVDCSQNPIA ELKCSVKSFEIDKGIYQTSN 

SVSKYPNFKSFQCVimFTSVYLNGDLVFTSincrTDVTSAGVYFKAGGPVNYSIMK
PEDV -PHLAIFAIPIXSATEVPYYCFLKVDTYNSTVYKFl/AVLPSTVREIVITKYGDVYVNGFGY

CCOV SSFYSYGEISFGVTD6PRYCFA LYNGTALKYLGTLPPSVKEIAISKWGHFYINGYNF 

PRC SSFSSYGEIPFG\raGPRYCYV---I»YNGTALKYLGTLPPSVKEIAISKWGHFYINGYKF 

FICV PANNSVSHIPFGKT--AHFCFAN-FSHSIVSRQFI1GILPPTVREPAFGRDGSIFVNGYKY 

BoCov -YTVQPIADVYRRIFNLPDCNIE^^Wl^NDKSVPSPIiNVmBKTFSNCNFl^S^ 

OC43 -YTVQPX7U)VYRRIPI^PIX:NIEAWLNDKSVPSPLNWERKTFSNCNFNMSSLMSFIQ 

PHBV -YTVQPVATVYRRIPDLPNCDIEAWLNSKTVSSPLNWERKIFSNCNFNMGRLMSFIQADS 

MHV -YTVQPVGWYRRVPNLPDCKIEEWLTAKSVPSPIiNWERRTFQNCNFNLS SLLRYVQAES 

TOR2_S -FRVVPSGDVVRFPNITNLCPFGEVFNATKFPSVYAWERKKISNCVADYSVLYNSTFFST

AIBV -EFKVLAYFVNGTAQDVILCDNSPKGLIACQYmX3NFSIX5FYPFTNSTIiVREKFIVYRES 

* 

22 9B FTIiGNVEAVNFNVTTAETTD FFTVALASYADVLVMVSQTSIANIIYCNSVINRLRC \ 

PESV LHIX3LLI)AVTIYFT6HGTDDDVSGFWTIASTNFV£tfa«IEVQGTSIQRILYCDDF^ 

CCov FSTFPIDCISFNLTTGDSGA FWTIAYTSYTDALVQVENTAIKKVTYCNSHINNIKC 

PRC FSTFPIDCISFNLTTGDSDV FWTIAYTSYTEALVQVENTAITNVTYCNSYVNNIKC ' 

FICV FSLPAIRSVNFSISSVEEYG FWTIAYTNYTDVMVDVKGTAITRLFYCDSPLNRIKC 

BoCov FTCNNIDAAKIYGMCFSSIT IDKFAIPNGRKVDLQLGNLGYLQSFNYRIDTTATSC 

OC43 FTCNNIDAAKIYGMCFSSIT IDKFAIPNGRKVDliQLGNLGYLQSFNYRIDTTATSC 

PHBV FGCNNIDASRLYGMCFGSIT IDKFAIPNSRKVDLQVGKSGYLQSFNYKIDTAVSSC 

MHV LSCNNIDASKVYGMCFGSVS VDKFAIPRSRQIDLQIGNSGFLQTANYKIDTAATSC 

TOR2_S FKCYGVSATKLNDLCFSNVY ADSFVVKGDDVRQIAPGQTGVIADYNYKLPDDFMGC

AIBV SVNTTLALTNFTFTNVSNAQ PNSGGVHTFHLYQTQTAQSGYYNFNLSFLSQFVYKA 



2 2 9E DQLSFYVPDGFYSTSP- - IQSVELPVSIVSLP VYHKHMFIVLYVDPKPQ 

PEDV SQVAFDLDDGFYPISSRNLLSHEQPISFVTLP SFNDHSFVNITVSAA 

CCOV SQLTANLQNGFYPVAS— SEVGLVNKSWLLP SFYSHTSVNITIDLGMKR— 

PRC SQLTANLNNGFYPVSS - - SEVGSVNKSWLIiP SFLTHTIVNITIGLGMKR- - 

FICV OQI;KHELPDGFYSASM--LVKKDLPKTFVTMP QFYHWMNVTLHWLNDTEKK 

BoCov -QLYYNLPAANVSVSRFNPSTWNRRFGFTEQFVFKPQPVGVFTHHDWYAQHCFKAPKNF 

OC43 -QLYYNliPAi^NVSVSRFNPSIWNRRFGFTEQSVFKPQPAGVFTDHDWYAQHCFKAPTNF 

PHEV -QLYYSLPAANVSVTHYNPSSWNRRYGFNNQS FGSRGLHDAVYSQQCFNTPNTY 

MHV -QLYYSLPKNNVTINNYNPSSWNRRYGFKVND 

T0R2_S -VLAWNTRNIDATSTG NYNYKYRYLRHG — * 

AIBV SDYMYGSYHPICAFRP— -ETINSGI.WFNSLS
SGYGQPIASTLS — NITLPMQDNNTD VYCVRSDQFSVYVHSTCKSALWDNVFKR

YDI ILAKAPELAALADVHFEI AQANG SVTNVTSLCVQARQLA LFYKYTSL 

— CPCKLDGSLCVGNGPGIDAGYKNS6IG TCPAGTNYLT CHNAA QCDC 

- -CPCKLDGSLCVGNGPGIDAGYKNSGIG TCPAGTNYLT CHNAV— — QCNC

— CPCp'j;. — SQCIG 6 AGTG TCPVGTTVRK CFAAVTKATKCTC
DCTDVLYATAVIKTGTCPFSFDKLNNYLTFNKFCLSLNPVGAN-CKFDVAARTRTNEQW
NCTDVLDATAVIKTGTCPFSFDKLNNYLTFNKFCLSLSPVGAN-CKFDVAARTRTNEQfVV 
QGLYTYSNLVELQNYDCPFSPQQFNNYLQFETLCFDVNPAVAG-CKWSLVHDVQWRTQFA 
LCTPDPITSKSTGPYKCPQTKYLVGIGEHCSGLAIKSDYCGGNPCTCQPQAFLGWSVDSC 
LCTPDPITSKSTGPYKCPQTKYLVGIGEHCSGLAIKSDYCGGMPCTCQPQAFLGWSVDSC 
WCQPDPSTYKGVNAWTCPQSKVSIQPGQHCPGLGLVEDDCSGNPCTCKPQAFIGWSSETC 

KLR PFERDISN--VPFSPDGKPCTPPALN-CYWPLNDYGFYTTTGI

VSLTYGPLQGGYKQSVFSGKATCCYAYSYNGPRT^KGVYSGELSRDFECG
IG 4.TLYVSWSDGDGITGVPQ-FVEGVSSFMNVTLDKCTKYNIYDVSGVGVIRVSNDT

LT SLYFQFTKGELITGTPK-PLEGITDVSFMTLDVCTKYTIYGFKGEGIITLTNSS 

R SLYVI YEEGDNIVGVPS -DNSGLHDLSVLHLDSCTDYNIYGITGVGI IRQTNST 

R SLYVIYEEGDSIVGVPS-DNSGLHDLSVLHLDSCTDYNIYGRTGVGIIRQTNRT 

T ITVSYKHGSMITTHAKGHSWGFQDTSVLVKDECTDYNryGFQGTGIIRNTTSR 

LQGDRCNIFANFIFHDVNSGTTC'STDLQXSNTDIILGVCVNyDLyGITGQGIFVEVNAT 
LQGDRCNIFANFILHDVNSGTTC-STDLQKSNTDlILGVCVNYDIiYGITGQGIFVEVNAP 
LQNGRCNIFANFILNDVNSGTTC-STDLQQGNTIITTDVCVNYDLYGITGQGILIEVNAT 

RCQIFANILLNGINSGTTC-STDLQLPNTEVATGVCVRYDLYGITGQGVFKEVKAD 

G YQPYRVVVLSFELLNAPA-TVCGPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKR

1, LVYVTKSDGSRIQTRTEPLVLTQHNYNNITLDKCVAYNIYGRVGQGFITNVTDS
1,1.8—.' GLYYTSLSGDLLGFKNVSDGVIYSVTPCDV SAQAAVIDGTIVGAI
YYWS -WQNLLYDSSGNLYGFRDYLSNRTFLIRSCYSG — RVSAVFHANS SEPAL

YYNS WQALLYDVNGNLNGFRDLTTNKTYTIRSC YSG— RVS AAYHKEAPEPAL 

FQP FQQFGRDVSDFTDSVRDPKTSEILDISPCAFGGVSVITPGTNASSEVAV 

VANFSYLADGGLAILDTSGAIDVFWQGSYGLNYYKVNPCEDVN—QQFWSGGNIVGIL
LFRNIKCN YVFNNTLSRQLQPINYFDSYLGCWNADN . STS

LFRNIKCN YVFNNTLSRQLQP INYFDS YLGCWNADN STA 

MFRNLKCS HVFNNTILRQIQLVNYFDSYLGCWNAYN NTA 

LYRN INCS— YVFTNNI SREENPLNYFDSYLGC\A7NADN RTD 

LYQDVNCT DVSTAIHADQLTPAWRIYSTGNNVFQTQAGCLIGAEHV 

TSRNETGS E-QVENQFYVKLTNSSHRRRRS IG
PRC VGCEPVITYSNIGVCKNGALVFI NVTHSDGDVQPISTGNVT

FICV -NCTSAITYSSFAICNTGEIKYVNVTHVEIVDDSIGVIKPVSTGNIS 

BoCov SWQTCDLTVGSGYCVDYSTKRRSR>RAITTGYRFTNFEPFTVNSVNDSLEPVGGLYEIQ
EALPNCNUlMGAGIiCVDYSKSRRAR-RSVSTGYRLTTFEPYMPMLVNDSVQSVGGLYEMQ

DTSYECDIPIGAGICASYHTVSLLRSTSQKSIVAYTMSLGADSSIAYSNN TIA

QIWTSCPYVSYGRFCIEPDGSLKMI-^ VPEELKQFVAPLLNITES VL
BoCov

OC43

PHEV

MHV

TOR2_S

AIBV



IPSI^ISVQVEYLQITSTPIVVIX:STYVCNGNVRCVELIiKQYTSACKTIEDALRNSARL 

IPTNFSMSIRTEYLQLYNTPVSVDCATYVCNGNSRCKQLLTQYTAACKTIESALQLSARL 

IPTNPTISVQVEYlQVYTTPVSIDCSRYVCNGNPRCNKLIiTQYVSACWriEQAL^ 

IPTNPTISVQVEYIQVYTTPVSIIXSRYVaiGNPRCNKLLTQYVSACCrriEQAL^^ 

IPKNPTVAVQAEYIQIQVKPVVVmCATYVCNGNTHCLKLLTQYTSACCn'IENAIJ^^ 

IPSEFXIGNMEBFIQTSSPKVT1DCSAFVC6DYAACKSQLVEYGSFCDNINAILTEVNEL 

IPSEFTIGNHEEFlQTSSPKVTIIXrSAFVCGDCAACKSQLVEYGSFCDNINAIItTEN/NBL 

IPSEFTIGNLEEFIQTRSPKVTIDCATFVCGDYAACRQQLAEYGSFCENINAILTEVNEL 

IPTNPTIGHHEEFIQIRAPiWTIDCAAFVCGDNAACRQQLVEYGSFCDNVNAILNEVNNL 

IPTNFSISITTEVMPVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAAE

IPNSPNLTVTDEYIQTRMDKVQINCLQYVCGNSLECRKLFQQYGPVCDNILSWNSVSQK
PEDV

CCoV
ESADVSEMLTFDKKAFTliANVSSF-GD YNLSSVIPS LPTSGSR —

ESVEVNSMIjTISEEALQLATISSFNGDG YNFTNVLQASVY DPASORV — 

ENMEXDSMLFVSENALKLASVEAFNSTETLDFIYKEWFNIGGSWIjGGLKDILPSI^ 
ENMEVDSMLFVSENALKIiASVEAFNSSETLDPIYTQWPNIGGFWLEGIOTILPSDMSK — 

ESLMLNDMITVSDRGLEIiATVERFNATA L6GEKLGGLYFD6 LSSLLPPK— 

LDTTQLQVANSLMNGVTLSTKLKDGVN PNVDDINFSPVLG CLGSACNK — 

liDTTQLQVANSLMNGVTLSTKLKDGVN FNVDDVNFSPVLG CLGSECNK — 

LDTTQLQVANSLMNGVTLSTKIKDGIN FNVDDINFSPVIiG--'-CLGSECNR— 

LDNMQLQVASALMQGVTISSRLPDGIS GPIDDINFSPLLG CIGSTCAEDG 

QDRNTREVFAQVKQMYKTPTLKYFGGF-- NFSQILPDPLKP

EDMELLSFYSSTKPKGYDTPVLSNVSTG EPNISLLIiTPPSSP — 

• • • • •
VAGRSAIEDILFSKIVTSGI/STVDADYKNCTKGLS- -1ADIACAQYYNGIMVLP0

VQKRSVJEDIiIJT«KVVTNGLGTVDBI3YKRCSNGRS — VADLVCAQYYSGVMVLPG 

RKYRSAIEDLLPDKWTSGLGTVDEDYKRCTGGYD- - lADLVCAQYYNGIMVLPG 

RKYRSAIEDLLFSKVVTSGLGTVDEDYKRCTGGYD— lADLVCAQYYNGIMVLPG 

IGKRSAVEDLLFNKWTSGLGTVDDDYKKCS SGTD- - VADLVCAQYYNGIMVLPG 

VSSRSAIEDLLFSKVKLSDVG -FVEAYNNCTGGAE— IRDLICVQS YNGIKVLPP 

VSSRSAIEDLLFSKVRLSDVG-FVEAYNNCTGGAG— IRDLICVQSYNGIKVLPP 

ASTRSAIEDLLFDKVKLSDVG-FVQAYNNCTGGAE— XRDLICVQSYNGIKVLPP. 

NGPSAIRGRSAIEDLLFDKVKLSDVG-FVEAYNNCTGGQE — VRDLLCVQSFNGIKVLPP 

TKRSFI EDLLFNKVTLADAG -FMKQYGECLGDIN — ARDLICAQKFNGLTVLPP 

SGRSFVEDLLFTSVETVGLP-TDAEYKKCTAGPLGTLKDLICAREYNGLLVLPP 

.*«.** .: . ♦ .* . *♦ :.*: ***
VADAERMAMYTGSLIGGIALGGLT SAVSIPFSLAIQARLMYVALQTDVLQENQKIL

WDAEKLHMYSASLIGGMALGGIT AAAALPPSYAVQARIiNYIiALQTDVLQRNQQLL 

VANDDKMAMYTASLAGGITLGSLGG- - - GAVSIPFAIAVQARLNYVALQTDVLNKNQQIL 

VANADKMTMYTASLAGGITLGAFGG GAVSIPFAVAVQARLNYVALQTDVLNKNQQIL 

WDGNKMSMYTASLIGGMALGSIT SAVAVPFAMQVQARLNYVALQTDVLQENQKIL 

LLSVNQISGYTLAATSASLFPPLS AAVGVPFYLNVQYRINGIGVTMDVLSQNQKLI 

LLSDNQI SGYTLAATSANLFPPWS AAAGVPFYLNVQYRINGIGVTMDVLSQNQKLI 

LLSENQISGYTLAATAASLFPPWT AAAGVPFYLNVQYRINGLGVTMDVLSQNQKLI 

VLSESQISGYTAGATAAAMFPPWT AAA6VPFSLNVQYRINGLGVTMEIVLSENQKMI 

LLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKQI
IITADMQTMYTASLVGAMAFGGIT >SAAAIPFATQIQARINHLGIAQSLLMKNQEXI
AASFNKAMTNIVDAFTGVNDAITQTSQALQTVATALNKIQDVVNQQGNSLNHLTSQLRQN
AESFKSAIGNITSAFESVKEAISQTSKGLNTVAHALTKVQEWNSQGSALNQLTVQLQHN 
ANAFNQAIGNITQAFGKVITOAIHQTSQGLAWAKVLAKVQDVVNTQGQALSHLTLQLQNN 
ASAFNQAIGNITQSFGKVNDAIHQTSRGLTTVAKALAKVQDWNTQGQALRHLTVQLQNN 
ANAFKmiGNITLALGKVSl^XTTTSDGPNSMAS/VLTKIQSVVNQQGEALSQLTSQLQKN 

ANAFNNALDAIQEGFDATN S - ALVKIQAWNANAEALNNLLQQLSNR 

ANAFNNALDAIQEGFDATN S- ALVKIQAWNADAEALNNLLQQLSNR 

ASAFNNALDAIQEGFDATN S-ALVKIQAWNANAEALNNLLQQLSNR 

ASAFKKALGAIQEGFDATN S-ALGKIQSWNANAEALtJNLLNQLSHR 

ANQFNKAISQIQESLTTTS * TALGKLQDVVNQNAQALNTLVKQLSSN

AASFNKAIGHMQEGFRSTS LALQQVQDWNKQSAILTETMNSLNKM 

****••• -* !•* S- * .* .
229E

PEDV

CCoV

PRC

FICV

BoCov

OC43

PHEV

MHV

TOR2_S

AIBV



FQAI SSS I QAIYDRLDTIQADQQVDRLITGRlJJUiNVFVSHTLTKYTEVRASRQLAQQKV 

FQAISSSIDDIYSRIJ^ILLArVOVDRLITGRLSAIJaAFVACm'TKYTEVQASIUC^ 

FQAI SSS I SDIYNRLDELSADAQVDRLITGRLTALNAFVSQTLTRQAEVRASRQLAKDKV 

FQAI SSS I SDIYNRLDELSADAQVDRLITGRLTALNAFVSQTLTRQAEVRASRQLAKDKV 

FQAlSSSIAEIYinUiEKVEADAQVDRLITGRLAALNAWSQTLTQYAEVKASRQIALEKV 

FGAISSSLQEILSRLDALEAQAQIDRLINGRLTALNVYVSQQLSDSTLVKFSAAQAMERyr 

FGAIf^SLQEILSRLDALEAQAQIDRLINGKLTALDAYVSQQLSDSTLVKFSAAQAMEKV 

FGAISASLQEXIiSRLDALEAKAQIDRLINGRLTAliNAYVSQQLSDSTLVKFSAAQAlEKV 

FGAISASI^EILTRIiDAVEAKAOIDRLINGKLTALNAyiSKQLSDSTLXKFSAAQAIEinr 

FGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKM

FGAISSVIQDIYAQLDAIQADAQVDRLXTGRLSSLSVIiASAKQSEVIRVSQQRELATQKI
NECVKSQSKRYGFCG-NGTHIFSI\7NAAPEGLVFLHTVLLPTQYKDVEAWSGLCV-I)G —
NECVKSQSQRYGFCGGDGEHIFSLVQAAPQGLLFIjHTVLVPGDFVNVLAIAGLCV-NG — 
NECVRSQSQRPGFCG-NGTHLFSLANAAPNGMIFFHTVLLPTAYETVTAWSGICASDGDR 
NECVRSQSQRFGFCG-NGTHLFSLANAAPNGMIFFHTVOjLPTAYETVTAWSGICALDGDR 
NECVKSQSNRYGFCG-NGTHLFSLWSAPEGLIiFFHTWjLPTEWEEVrAWSGICVN^ — 
NECVKSQSSRINFCG-NGNHIISLVQNAPYGLYFIHFSYVPTKYVTAKVSPGLCI 

necvksqssrinfcg-ngnhiislvqnapyglyfihfsyvptkyvtakvspglci 

necvksqssrinfcg-ngnhiislvqnapyglyfihfsyvptkyvtakvspgu:! 

necvksqttrinfcg-ngnhilslvqnapyglcfihfsyvptsfktanvspglci 

SECVLGQSKRVDFCG-KGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAICH

NECVKSQSNRYGFCG--SGRHV]jSIPQNAPNGIVFIHFTYTPETFVNVTAIVGFCVNPIiIIA
TNGYVLRQPNLALYK EGNYYRITSRIMFEPRIPTMADFVQIENCNVTFVNISRS

EIALTLREPGLVLFTHELQTYTATEYFVSSRRMFEPRKPTVSDFVQIESCWTYVNLTSD 

TFGLWKDVQLTLFRN LDDKFYLTPRTMYQPIVATSSDFVQIEGCDVLFVNATVl 

TPGLWKDVQLTLFRN LDDKFYLTPRTMYQPRVATSSDFVQIEGCDVLFVNTTVS 

-YAYVIiKDFDHSXFS YNGTYMVTPRNMPQPRKPQMSDFVQITSCEVTFIiNMTYT 

-AGDRGIAPKSGYFVN VNimJMFTGSGYYYPEPlTGNNVVVMSTCAVNyTKAPDV 

-AGDRGIAPKS6YFVN VNNTWMFTGSRYYYPEPITGNIJVVVMSTCAVNYTKAPDV 

-AGDIGISPKSGYFIN VNNSWMFTGSSYYYPEPITQNNVVVMSTCAVNYTKAPDL 

-SGDRGLAPKAGYFVQ DNGEWKFTGSNYYYPEPITDKNSVAMISCAVNYTKAPEV 

-EGKAYFPREGVFVFN GTSWFITQRNFFSPQIITTDNTFVSGNCDVVIGIINNT

SQYAIVPANGRGIFIQ VNGTYYITSRDMYMPRDITAGDIVTLTSCQANYVNVNKT 



229E ELQTIVP-EYIDVNKTLQELSYKL-PNYTVPDLV VEQYNQTIUOiTSEISTLENKSA 

PEDV QLPDVXF-DYrDVNKTLDEILASL-PNRTGPSLP LDVFNATYLNLTGEIADLBQRSE 

CCOV DLPSIIP-DYIDINQTVQDILENFRPNWTVPELP LDIFNATYLNLTGEINDLEFRSE 

PRC DLPSIIP-DXIDINQTVQDILENFRPNWTVPELT LDVFNATYLNLTGEIDDLEFRSE 

FICV TFQEIVI-DYIDINKTIADMLEQYNPNYTTPELNL-LLDIFNQTKIjNLTAEIDQLEQRAD 

BoCov MLNISTP-NLHDFKEELDQWFKNQ— TSVAPDLSL-DY — INVTFLDLQDEMN 

Cx:43 MLNISTP-NLPDFKEELDQWFKNQ— TLVAPDLSL-DY — INVTFLDIiQDEMN 

PHEV MLNTSTP-NLPDFKEELYQWFKNQ— SSVAPDLSL-DY — INVTFLDLQDEMN 

MHV FLNNSIP-NLPDFKEELDKWFKNQ- - TSIAPDLSL-DFEKLNVTFLDLTYEMN 

TOR2_S VYDPLQP-ELDSFKEELDKYFKNH TSPDVDLGDISGINASVVNIQKEID

AIBV^ > VITTFVEDDDFNFDDELSKWWNDT — KHGLPDFD DFNYTVPILNISGEID 

* • ••••• • 

229E ELNYTVQKLQTLIDNINSTLVDLKWLNRVETYIKWPWWVWLCISVVLIFVVSMLLL^^ 

PEDV SLRim'EELRSLINNINNTLVDLEWLNRVETYIKWPWWVVinillVIVIiIFVVSLLVFCCI^ 

CCov KLHNTTVELAILIDNINNTLVNLEWLNRIETYVKimmWLLIGLV^ 

PRC KLHNTTVELAILIDNINNTLVNLEWLNRI ETYVKWPWYWJLLIGLWIFCI PLLLFCCCS 

FICV OTiTTIAHELQQYIDNLNKTLVDLDWLNRIETYVKWPWYVWLLIGLVVVFCIPLLLFCCLS 

BoCov RLQEAIKVLNQSYINLKDIGTYEYYVKWPWYVWLLIGFAGVAMLVLLFFICCC 

OC43 ^RLQEAIKVLNQSYINLKDIGTYEYYVKWPWYVWLLIGFAGVAMLVLLFFICCC 

PHEV RLQEAIKVLNQSYINLKDIGTYEYYVKWPWYVWLLIGLAGVAMLVLLFFICCC 

MHV RIQDMKKLNESYINLKEVGTYEMYVKWPWYVWLLIGLAGVAVCVLLFFICCC 

TOR2_S ---RLNEVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCM

AIBV NIQGVIQGLNDSLINLEELSIIKTYIKWPWYVWLAIGFAIIIFILILGWVFFM
229E TGCCG-FFSCFASSIRGCCESTKL-PYYDVEKIHIQ

PEDV TGCCG-CCGCCGACFSGCCRGPRLQPYEAFEKVHVQ 

CCOV TGCCG-CIGCLGSCCHSICSRRQFESYEPIEKVHVH 

PRC TGCCG-CIGCLGSCCHSIFSRRQFENYEPIEKVHVH 

FICV TGFCG-CFGCVGSCCHSLCSRRQFETYEPIEKVHIH 

BoCov TGCGTSCFKICGGCCD-DYTGHQEIjVIX TSHDD 

OC43 T6CGTSCFKKCGGCCD-DYTGHQELVIK TSHBG 

PHBV TGCGTSCFKKCGGCCD-DYTGHQEFVIX TSHDD 

MRV TGCGSCCFRKCGSCCD-EYGGHQDSIVIHNISAHED 

TOR2_S TSCCSCLKGACSCGSCCKFDEDDSEPVLKGVKLHYT



AIBV > TGCCGCCCGCFGIIPLISKCGKKSSYYTTFraiDWTEQYRPKKSV 

* 

• • • 

Key      Name                                                                    GenBank     % ID*    SEQ ID NO

229E     spike glycoprotein [Human coronavirus 229E]                             AAK32m      26.6%    53
AIBV     spike glycoprotein [Avian infectious bronchitis virus]                  AAO34396    27.6%    54
BoCov    E2 glycoprotein precursor (Spike glycoprotein)                          P25193      30.5%    55
CCoV     spike protein - canine coronavirus                                      S41453      26.1%    56
FICV     peplomer protein [Feline infectious peritonitis virus]                  BAA06805    25.4%    57
MHV      E2 glycoprotein precursor (Spike glycoprotein)                          P11225      31.9%    58
OC43     surface protein - human coronavirus                                     S44241      30.7%    59
PEDV     spike protein [Porcine epidemic diarrhea virus]                         CAA80971    26.0%    60
PHEV     spike glycoprotein [Porcine hemagglutinating encephalomyelitis virus]   AAL80031    30.5%    61
PRC      S protein [Porcine respiratory coronavirus]                             AAA46905    27.5%    62
TOR2_S   SARS-associated virus S glycoprotein                                                         33
10 20 30 40 50

TOR2_E jiYSFVSEETGTLIWSVXiFIAFVVFLLVTIiAILTALIUXIAYCCNIVW

PGV inTPRALTVIDDNG-MVINIIFWPLLIIILILLSIALLNIIKUaiVCCN^ 

10 20 30 40 50 

60 70 
TOR2_E YVYSRVKNLNSSEQVPDLLV (SEQ ID NO: 35)

' • * • 

PGV HAYDAYKNFMRIKAVNPDGALIA (SEQ ID NO: 63) 
60 70 80 ■ 
MESLVLGVNEKTHVQLSLPVLQVRDVLVRGFGDSVEEALSEAREHLKNGT
CGLVELEKGVLPQLEQPYVFIKRSDALSTNHGHKVVELVAEMDGIQYGRS
GITLGVLVPHVGETPIAYRNVLLRKNGNKGAGGHSYGIDLKSYDLGDELG
TDPIEDYEQNWNTKHGSGALRELTRELNGGAVTRYVDNNFCGPDGYPLDC
IKDFLARAGKSMCTLSEQLDYIESKRGVYCCRDHEHEIAWFTERSDKSYE
HQTPFEIKSAKKFDTFKGECPKFVFPLNSKVKVIQPRVEKKKTEGFMGRI
RSVYPVASPQECNNMHLSTLMKCNHCDEVSWQTCDFLKATCEHCGTENLV
IEGPTTCGYLPTNAVVKMPCPACQDPEIGPEHSVADYHNHSNIETRLRKG
GRTRCFGGCVFAYVGCYNKRAYWVPRASADIGSGHTGITGDNVETLNEDL
LEILSRERVNINIVGDFHLNEEVAIILASFSASTSAFIDTIKSLDYKSFK
TIVESCGNYKVTKGKPVKGAWNIGQQRSVLTPLCGFPSQAAGVIRSIFAR
TLDAANHSIPDLQRAAVTILDGISEQSLRLVDAMVYTSDLLTNSVIIMAY

VTGGLVQQTSQWLSNLU3TTVEKIjRPIFEWIEAKLSAGVEFLXI)AWEIIiK 

FLlTGVFDIVKGQIQVASITOKIXnTKCFlDVVNKALEMCIDQVT 

RSLNLGEVFIAQSKGLYRQCIRGKEQLQLIMPLKAPKEVTFLEGDSHOTV 

LTSEEWLKNGEIjEALETPVDSFTNGAIVGTPVCVNGLMLLEIKDKEQYC 

ALSPGLIATNNVFRLKGGAPIKGVTFGEDTVWEVQGYKNVRITFELDERV 

DKVLNEKCSVYTVESGTEVTEFACWAEAVVKTLQFVSDLLTNMGIDLDB 

WSVATFYLFDDAGEENFSSRMYCSFYPPDEEEEDDAECEEEEIDETCEHB 

YGTEDDYQGLPLEFGASAETVRVEEEEEEDWLDDTTEQSEIEPEPEPTPB 

EPVNQFTGYLKLTDNVAIKCVDIVKEAQSANPMVIVNAANIHIiKHGGGVA 

GALJflCATNGAMQKESDDYIKIiNGPLTVGGSCUJSGHNLAKKCLHWGPNL 

NAGEDIQLLKAAYENFNSQDILLAPLLSAGIFGAKPLQSLQVCVQTVRTQ 

VYlAVNDKALYBQV\mi3YIinNl*KPRVEAPKQEEPPNTEDSKTEEKSVVQK 

PVD\TOKIKACIDE\m?TLEETKFLTNKLLLFADINGKLYHDSQNMLR6B 

DMSFLEKDAPYMVGDVITSGDITCVVIPSKKAGGTTEMLSRAIiKKVPVDB 

YITTYPGQGCAGYTLEEAKTALKKCKSAFYVLPSEAPNAKEEILGTVSWM 

LREMLAHAEETRKLMPICMDVRAIMATIQRKYKGIKIQEGIVDYGVRFFF 

YTSKEPVASI ITKLNSLNEPLVTMPIGYVTHGFNLEEAARCMRSLKAPAV 

VSVSSPDAVTTYNGYLTSSSKTSEBHFVETVSLAGSYRDWSYSGQRTELG 

VEFLKRGDKIWHTLESPVEFHLDGEVLSLDKLKSLLSLREVKTIKVFTT 

VDim^jHTQLVDMSMTYGQQFGPTYIjDGADVTKIKPHVNHEGKTFFVIiPS 

DDTLFLSEAFEYYHTLDESFLGRYMSAL^raTKKWKFPQVGGIiTSIKWADNM 

CYLSSVLLALQQLEVKFNAPALQEAYYRARAGDAANFCALIIiAYSNKTVO 

EIiGIWRETMTHLLQHANLESAKR\ajaWCKHCGQKTTTLTGVEAVMVM^ 

LSYDNLKTGVSIPCVCGRnATQYliVQQESSFVMMSAPPAEYKLQQGTPW: 
ANEYTGNYQCGHYTHITAKBTLYRIDGAHI»TKMSEYKGP\m>VFYKETSY 

TTTIKPVSYKLDGVTYTEIEPKLDGYYKKDNAYYTEQPIDLVPTQPLPNA 

SFDNFKLTCSNTKFADDLNQMTGFTKPASRELSVTFFPDLNGDWAIDYR 

HYSASFKKGAKLLHKP rVWHINQATTKTTFKPKlWCLRCLWSTKPVDTSN 

SFEVLAVEDTQGMDNLACESQQPTSEEVVENPTIQKEVIECDVKTTEVVG 

NNaLKPSDEGVKVTQELGHEDLMAAYVENTSITIKKPNELSLALGLKTIA 

THGIAAINSVPWSKILAYVKPFLGOAAITTSNCAKRLAQRVFNNYMPYVP 

TLLFQLCTFTKSTNSRIRASLPTTIAKNSVKSVAKLCLDAGINyVKSPKP 

SKLFTIAMWLLLLSICLGSLICVTAAFGVLLSNFGAPSYCNGVRELYiaiS 

SNVTTMDFCEGSFPCSICLSGLDSLDSYPALETIQVTISSYKLDLTILGL 

AAEWV^YmjFTKFFyLLGLSAIMQWFGYFASHFISNSWLMWFIISIVQ 

MAPVSAMVRMYIFFASFYYIWKSYVHIMDGCTSSTCMMCYKRNRATRVEC 

TTIVNGMKRSFYVYANGGRGFCKTHNWNCLNCDTFCTGSTFISDEVARDL 

SLQFKRPINPTDQSSYIVDSVAVKNGALHLYFDKAGQKTYERHPLSHFVN 

liDNLRANNTKGSLPINVIVFDGKSKCDESASKSASVYYSQLMCQPILLLD 

QALVSDVGDSTEVSVKMFDAYVDTFSATFSVPMEKLKALVATAHSELAKG 

VAIiIXJVLSTFVSAARQGVVI>TDVDTKDVIECLKLSHHSDLEVTGDSCN^ 

MLTYNKVEl^PRDIXSACIIXmRHINAQVAKSHNVSLIVnWKDYMS 

QLRKQIRSAAKKNNIPFRLTCATTRQVVNVITTKISLKGGKIVSTCFKLM 

LKATLLCVLAALVCYIVMPVHTLSIHDGYTNBIIGYKAIQDGVTRDIIST 

DDCFANKHAGFDAWFSQRGGSYKNDKSCPWAAIITREIGPIVPGLPGTV 

LRAINGDFLHFLPRVFSAVGNICYTPSKLIEYSDFATSACVLAAECTIFK 

DiAMGKPVPYCYDTNLLBGSISySELRPDTRYVLMDGSIIQFPNTYLBGW 

RWTTFDAEYCRHGTCERSEVGICLSTSGRWVLNNEHYRALSGVFCGVDA 

MNLIANIFTPbVQPVGAIiDVSASWAGGIIAILVTCAAYYFMKFRRVFGE 

YNHWAANALIiFLMSFTILCLVPAYSFLPGVYSVFYLYLTFYPTNDVSFL 

AHLQWFAMFSPIVPFWITAIYVFCISLKHCHWFFNNYLRKRVMFN6VTFS 

TFEEAALCTFLLNKEMYLKIiRSETLLPLTQYNRYIiALYNKYKYFSGALOT 

TSYREAACCHIiAKALNDFSNSGADVLYQPPQTSITSAVLQSGFRKMAFPS 

GKVEGCMVQVTCGTTTLNGLWLDDTVYC PRHVICTAEDMLNPNYEDLLIR 

KSNHSFLVQAGNVQLRVIGHSMQNCLLRLKVDTSNPKTPKYKFVRIQPGQ 

TPSVLACYNGSPSGVYQCAMRPNHTIKGSFLNGSCGSVGFNIDYDCVSFC 

YMHHMELPTGVHAGTDLEGKFYGPFVDRQTAQAAGTDTTITIiNVIiAWLYA 

AVINGDRWFLNRFTTTLNDFNLVAMKYNYEPLTQDHVDIIX3PLSAQTGZA 

VLDHCAALKELLQNGMNGRTILGSTILEDEFTPFDWRQCSGVTFQGKFK 
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KIVKGTHRWMLLTFLTSLLILVQSTQWSLFFFVYENAFIiPFTLGIMAIAA 
CAMLLVKHKHAFLCLFLLPSIiATVAYFNMVmiPASWVHRIMTWLEI^^ 
LS6YRLKIX:VMYASAIA?LI.IIiMTARTVyDDAARRVWTIJlNVITLV^ 
GNALDQAISMWMiVISVTSNYSGVVTTIMPli^^ 

ntik^cimlwcflgyccccyfglfcllnryfrltlgvydylvstqefrvm 
nsqgllppkssidafklnikllgiggkpcikvatvqskmsdvkctswiil 
svlqqlrvesssklwaqcvqlhm)iijjucdtteafe3aivsllsvllsmqg 
avdinrlceemldnratlq;^:^;^sefsslpsyaayataqeayeqavangds 
e\aa*kklkkslwaksefdrdaamqrklekmadqamtqmykqarsedkra 

K\n'SAHCm<LFrimiRKLDNDAU9NIINNARD^ 

VPDYGTYKNTCIXSNTFTyASAIAffilCKJfWnADSKIVQLSEIlM 

WPLIVTAIJUUJSAViaQNNELSFVALRQMSCAAGTTQTACTDDIVUJ^Yy^ 

NSKGGRFVLALLSDHQDLKMARFPKSD6TGTIYTELEPPCRFVTDTPKGP 

KVKYLYFIKGLNNLNR6MVL6SLAATVRIiOAGNATE\^ANSTVLSFCAFA 

VDPAKAYKDYIASGGQPITNCVKMLCTHTGTGQAIT\rrPEAimDQESF^ 

ASCCLYCRCHIDHFNPKGFCDIJCGKYVQIPTTCANDPVGFTLWJTVCTVC 

GMWKGYGCSCDQLREPLMQSADASTP 


(SEQ ID NO: 64) 
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FKRVCG 

VSAARLTPCGTGTSTDWyRAFDIYNEKVAGFAKPLKTNCCRFQEKDEBO 

NLLDSYFWKRHTMSNYQHEETIYNLWXJPAVAVHDFFK^ 

ISRQRLTKYTMADLVYALRHFDEGNCDTLKEILVTYNCCTDDYFNKKDWy 

DFVENPDIIiRVYANLGERWQSLLKWQFCDAMIUDAGIVGVLTLDNQDL^ 

GNWYDFGDrVQVAPGCGVPIVDSYYSIiLMPILTLTRAIiAAESHMDADLAK 

PLIKWDLLKYDFTEERIiCIiFDRYFKYWDQTYHPNCINCLDDRCILHCANF 

NVLFSTVFPPTSFGPLVRKIFVIXSfVPFWSTGYHFRELGVVHNQDVNLHS 

SRLSFKELLWAADPAMHAASGNLLIlDKRTTCPSVAALT^DWAFQTVKPG 

NFNKDFYDFAVSKGFFKEGSSVELKHFFFAQDGNAAISDYDyVRYNLPTM 

CDIRQLLFWEWDKYFDCYDGGC INT^QVIVNNLDKSAGFPFNKWGKAR 

LYYDSMSYEDQDALFAYTKRNVIPTITQMNLKyAISAKNRARTVAGVSIC 

STmTJRQFHQKLLKSIAATRGATWIGTSKFYGGWHNMLKTVYSEVETPH 

LMGWDYPKCDRAMFimLRIMASLVIJaUaSNTCCNLSHRFYRIJ^MECAQ 

SEM\mCGGSLY\TOG6TSSGDATTAyANSVFNICQAVTANVNAIiLSTDGN 

KIADKY\mmjQHRLYECLyRinU3VDHEFVDEFYAYIjRKHFSMMILSDDAV 

VCYNSNYAAQGLVASIKNFKAVLYYQNNVFMSEAKCWTETDLTKGPHEFC 

SQHTMLVKQGDDYVYIiPYPDPSRILGAGCFVDDIVKTDGTIiMIERFVSLA 

IDAYPLTKHPNQEYADVFHLYLQYIRKLHDELTGHMIJJMYSVMLTNDNTS 

RYWEPEFYEAMYTPHTVLQAVGACVLCNSQTSLRCGACIRRPFIiCCKCCY 

DHVISTSHKLVLSVNPYVCNAPGCDVTDVTQIiYLGGMSYYCKSHKPPISF 

PLCANGQWGLYKNTCVGSDNVTDFNAIATCDWTNAGDYILANTCTERLK 

LFAAETLKATEETFKIiSYGIATVREVliSDRELHLSWEVGKPRPPLNRNYV 

FTGYRWKNSKVQIGEYTFEKGDyGDAVVYRGTTTYKLNVGDYFVIjTSHT 

VMPLSAPTLVPQEHYVRITGLyPTLNISDEFSSNVANYQKVGMQKySTLQ 

GPPGTGKSHFAIGLALYyPSARtVYTACSHAAVDALCEKALKYIiPIDKCS 

RI I PARARVBCFDKFKVNSTLEQYVPCTVNAIiPETTADI\A;FDEI SMArm 

ydlsvvnariju^khyvyigdpaqlpaprtlltkgtlepeyfnsvcrlmkt 
igpdmflgtcrrcpaeivdtvsalvydnklkahkdksaqcfkmfykgvit 
hdvssainrpqigwrefltrnpawrkavfispynsqnavaskilglptq 
tvdssqgseydyviftqttetahscn\7nrfnvaitrakigiiicimsdrdii 
ydklqftsleiprrnvatlqaenvtglfkdcskiitglhptqapthlsvd 

IKFKTEGLCVDI PG I PKDMTYRRL I SMMGFKMNYQVNGY PNMPITREEAI 
RHVRAWIGFDVEGCHATRDAVGTNLPLQLGFSTGVNLVAVPTGYVDTENN 
TEFTRVNAKPPPGDQFKHLIPIiMYKGIiPWNVVRIKIVQMLSDTLKGLSDR 
WFVLWAHGFELTSMKYFVKIGPERTCCLCDKRATCFSTSSDTYACWNHS 

vgfdyvynpfmiiwqqwgftgotiqsnhdqhcqvhgnahvascnaimtrcl 
avhecfvkrvdwsveypiigdei*rvnsacrkvqhm\a^alladkfpvlh 

DIGNPKAIKCVPQAEVEWKFYDAQPCSDKAYKIEELFYSYATHHDKFTDG 

VCLFWNCNVDRYPANAIVCRFDTRVLSNIjNLPGCDGGSLYVNKHAFHTPA 

FDKSAFTNLKQLPFFYYSDSPCESHGKQWSDIDYVPLKSATCITRCNIjG 

GAVCRHHANEYRQYLDAYNMMI SAGFSLWiyKQFDTyi«iWNTFTRLQSLE 

NVAYNVVNKGHFDGHAGEAPVSIII^AVYTKVDGIDVEIFENKTTLPVNV 

AFELWAKRNIKPVPEIKILNmiGVDIAANTVIWDYKREAPAHVSTIGVCT 

MTDIAKKPTESACSSLTVLFDGRVEGQVDLFRNARNGVLITEGSVKGIiTP 

SKGPAQASVNGVTLIGESVKTQFim*KICVDGIIQQLPETYFTQSRDLEDP 

KPRSQMETDFLELAMDEFIQRYKLEGYAFEHIVYGDFSHGQLGGLHLMIG 

IJUaiSQDSPIJCLEDFIPMDSTVnCNYFITDAQTGSSKCVCSVIDIiIiLra 

EIIKSQDLSVISKVVKVTIDYAEISFMLWCKDGHVETFYPKLQASQAWQP 

gvampnlykmqrmllekcdlqnygenavipkgimmnvakytqlcqylntl 
tlavpynmrvihfgagsdkgvapgtavlrqwlptgtllvdsdlndfvsda 
dstligdcatvhtankwdli i sdmydprtkhvtkendskegfftylcgfi 
kqkxju^ggsiavkitehswnadlyklmghfswwtafvtnvnassseafli 
ganylgkpkeqidgytmhanyifwrntnpiqlssyslfdmskfplklrgt 

AVMSLKENQINDMIYSLLEKGRLIIRENNRVWSSDILVNN 
(SEQ ID NO: 65)
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MDLFMRFFTLRSITAQPVKIDNASPASTVHATATIPLQASLPFGWIiVIGV 
AFLAWQSATKIIAUSFKRWQLALYKGFQFICNLLLLFVTIYSHLLLVAAG 
MEAQFLYLYALIYFLQCINACRIimCWnLjCWKCKSKNPLLyDANyFVCWH 
THNYDYCIPYNSVTDTIWTEGDGISTPKLKEDYQIGGYSEDRHSGVKDY 
VVVHGYFTEVYYQj:.ESTQITTDTGIENATFFIFNKLVKDPPNVQIHTIDG 
SSGVANEAMDPIYDEPTTTTSVPL (SEQ ID NO: 66). 

FIGURE 18 



MMPTTLFAGTHITMTTVYHITVSQIQLSLLKVTAFQHQNSKKTTKLWIL 
RIGTQVLKTMSLYMAISPKFTTSLSLHKLLQTLVLKMLHSSSLTSLLKTH 
RMCKYTQSTALQELLIQQWIQFMMSRIUUiLACLCKHKKVSTNLCTHSFRK 
KQVR (SEQ ID NO: 67) 

FIGURE 19 

> 

MFHLVDFQVTIAEILIIIMRTFRIAIWNLDVIISSIVRQLFKPLTKKNYS
ELDDEEPMELDYP (SEQ ID NO: 68) 

FIGURE 20 

MKIILFLTLIVFTSCELYHYQECVRGTTVLLKEPCPSGTYEGNSPFHPLA
DNKFALTCTSTHFAFACADGTRHTYQLRARSVSPKLFIRQEEVQQELYSP
LFLIVAALVFLILCFTIKRKTE (SEQ ID NO: 69)

FIGURE 21 

MNELTLIDFYLCFLAFLLFLVLIMLIIFWFSLEIQDLEEPCTKV

(SEQ ID NO: 70) 

FIGURE 22 

MKLLIVLTCISLCSCICTWQRCASNKPHVLEDPCKVQH
(SEQ ID NO: 71) 

FIGURE 23 



MCLKILVRYNTRGNTYSTAWLCALGKVLPFHRWHTMVQTCTPNVTINCQD
PAGGALIARCWYLHEGHQTAAFRDVLWLNKRTN (SEQ ID NO: 72) 
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MDPNQTNWPPALHLVDPQIQLTITRJffiDAMGQGQNSADPKVYPIILRLCi 
SQLSLSMARRNLDSLEARAFQSTPIVVQMTKLATTEELPDEFVVVTAK . 

(SEQ ID NO: 73) ' 

FIGURE 25 



MLPPCYNFLKEQHCQKASTQREAEAAVKPLLAPHHWAVIQEIQLLAAVG 
EILLLEWLAEWKLPSRYCC (SEQ ID NO: 74) 



FIGURE 26 

CIAVGQLCVFWNIGRPCCSGLCVFA— CTVKL conotoxin 
CISLCS-CICTWQRCASNKPHVLEDPCKVQH sars 

If 'k a 

• • • ••• • •• 
SEQUENCE LISTING

<110> BC CANCER AGENCY

<120> SARS VIRUS NUCLEOTIDE AND AMINO ACID SEQUENCES AND USES THEREOF

<130> 82936-7

<160> 206

<170> PatentIn version 3.3

<210> 1
<211> 29736
<212> DNA
<213> Severe acute respiratory syndrome virus
<400> 1

ctacccagga aaagccaacc aacctcgatc tcttgtagat ctgttctcta aacgaacttt 60
aaaatctgtg tagctgtcgc tcggctgcat gcctagtgca cctacgcagt ataaacaata 120

ataaatttta ctgtcgttga caagaaacga gtaactcgtc cctcttctgc agactgctta 180
cggtttcgtc cgtgttgcag tcgatcatca gcatacctag gtttcgtccg ggtgtgaccg 240
aaaggtaaga tggagagcct tgttcttggt gtcaacgaga aaacacacgt ccaactcagt 300

ttgcctgtcc ttcaggttag agacgtgcta gtgcgtggct tcggggactc tgtggaagag 360

gccctatcgg aggcacgtga acacctcaaa aatggcactt gtggtctagt agagctggaa 420

aaaggcgtac tgccccagct tgaacagccc tatgtgttca ttaaacgttc tgatgcctta 480

agcaccaatc acggccacaa ggtcgttgag ctggttgcag aaatggacgg cattcagtac 540
ggtcgtagcg gtataacact gggagtactc gtgccacatg tgggcgaaac cccaattgca 600

taccgcaatg ttcttcttcg taagaacggt aataagggag ccggtggtca tagctatggc 660

atcgatctaa agtcttatga cttaggtgac gagcttggca ctgatcccat tgaagattat 720

gaacaaaact ggaacactaa gcatggcagt ggtgcactcc gtgaactcac tcgtgagctc 780

aatggaggtg cagtcactcg ctatgtcgac aacaatttct gtggcccaga tgggtaccct 840

cttgattgca tcaaagattt tctcgcacgc gcgggcaagt caatgtgcac tctttccgaa 900

caact€gatt acatcgagtc gaagagaggt gtctactgct gccgtgacca tgagcatgaa . 960 

attgcctggt tcactgagcg ctctgataag agctacgagc accagacacc cttcgaaatt 1020 

aagagtgcca agaaatttga cactttcaaa ggggaatgcc caaagtttgt gtttcctctt 1080

aactcaaaag tcaaagtcat tcaaccacgt gttgaaaaga aaaagactga gggtttcatg 1140

gggcgtatac gctctgtgta ccctgttgca tctccacagg agtgtaacaa tatgcacttg 1200

tctaccttga tgaaatgtaa tcattgcgat gaagtttcat ggcagacgtg cgactttctg 1260

aaagccactt gtgaacattg tggcactgaa aatttagtta ttgaaggacc tactacatgt 1320
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gggtacctac ctactaaigc tgtagtgaaa atgccatgtc ctgcctgtca agacccagag 1380 

attggacctg agcatagtgt tgcagattat cacaaccact caaacattga aactcgactc 1440 

cgcaagggag gtaggactag atgttttgga ggctgtgtgt ttgcctatgtf tggctgctat 1500 

aataagcgtg cctactgggt fccctcgtgct agtgctgata ttggctcagg •ccatactggc 1560 

r 

attactggtg acaatgtgga gaccttgaat gaggatctcc ttgagatact gagtcgtgaa 1620 

cgtgttaaca ttaacattgt tggcgatttt catttgaatg aagaggttgc catcattttg 1680 

gcatctttct ctgcttctac aagtgccttt attgacacta taaagagtct tgattacaag 1740

tctttcaaaa ccattgttga gtcctgcggt aactataaag ttaccaaggg aaagcccgta 1800 

aaaggtgctt ggaacattgg acaacagaga tcagttttaa caccactgtg tggttttccc 1860 

tcacaggctg ctggtgttat cagatcaatt" tttgcgcgca cacttgatgc agcaaaccac - 1920 

tcaattcctg. atttgcaaag agcagctgtc accatacttg atggtatttc tgaacagtca 1980 

« * 

ttacgtcttg tcgacgccat ggtttatact tcagacctgc tcaccaacag tgtcattatt 2040 

. atggcatatg taactggtgg tcttgtacaa cagacttctc agtggttgtc taatcttttg 2100 

ggcactactg ttgaaaaact caggcctatc tttgaatgga ttgaggcgaa acttagtgca 2160 

■ 

> < 

ggagttgaat ttctcaagga tgcttgggag attctcaaat ttctcattac aggtgttttt 2220 

• . • - 

gacatcgtca agggtcaaat acaggttgct tcagataaca tcaWggattg tgtaaaatgc 2280 

ttcattgatg ttgttaacaa ggcactcgaa atgtgcattg atcaagtcac tatcgctggc 2340 

^caaagttgc gatcactcaa cttaggtgaa gtcttcatcg ctcaaagcaa gggactttac 2400 

cgtcagtgta tacgtggcaa ggagcagctg caactact6a tgcctcttaa .ggcaccaaaa 2460 

gaagtaacct ttcttgaagg tgattcacat gacacagtac ttacctctga ggaggttgtt 2520 

ctcaagaacg gtgaactcga agcactcgag acgcccgttg atagcttcac aaatggagct 2580 

atcgtcggca caccagtctg tgtaaatggc. ctcatgctct tagagattaa ggacaaagaa 2640 

caatactgcg cattgtctcc tggtttactg gctacaaaca atgtctttcg cttaaaaggg 2700 

ggtgcaccaa ttaaaggtgt aacctttgga gaagatactg tttgggaagt tcaaggttac 2760 

aagaatgtga gaatcacatt tgagcttgat gaacgtgttg acaaagtgct taatgaaaag 2820 

tgctctgtct acactgttga atccggtacc gaagttactg agtttgcatg tgttgtagca 2880 

gaggctgttg tgaagacttt acaaccagtt tctgatctcc ttaccaacat gggtattgat 2940 

cttgatgagt ggagtgtagc tacattctac ttatttgatg atgctggtga agaaaiacttt * 3000 

tcatcacgta tgtattgttc cttttaccct ccagatgagg aagaagagga cgatgcagag 3060 

tgtgaggaag aagaaattga tgaaacctgt gaacatgagt acggtacaga ggatgattat 3120 

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caaggtctcc ctctggaatt tggtgcctca .gctgaaacag ttcgagttga ggaagaagaa 3180 

gaggaagact ggctggatga tactactgag c^iatcagaga ttgagccaga accagaacct * 3240 

• • • . 

acacctgaag aaccagttaa tcagtttact ggttatttaa aacttactga caatgttgcc * 3300 

« 

attaaatgtg ttgacatcgt taaggaggca caaagtgcta atcctatggt gattgtaaat' 3360 

■ 

gctgctaaca tacacctgaa acatggtggt ggtgtagcag. gtgcactcaa caaggcaacc 3420 

■ • 

aatggtgcca tgcaaaagga gagtgatgat tacattaagc taaatggccc* tcttacagta 3480 

• • * • • • • . . 

ggagggtctt gtttgctttc tggacataat cttgctaaga agtgtctgca tgttgttgga 3540 

4 

cctaacctaa atgcaggtga ggacatccag cttcttaagg cagcatatga aaatttcaat 3600 

tcacaggaca tcttacttgc accattgttg tcagcaggca tatttggtgc taaaccactt 3660 

cagtctttac aagtgtgcgt gcagacggtt cgtacacagg tttatattgc agtc^atgac 3720 

aaagctcttt- atgagcaggt tgtcatggat tatcttgata acctgaagcc tagagtggaa 3780 

gcacctaaac aagaggagcc accaaacaca gaagattcca aaactgagga gaaatctgtc 3840 

. gtacagaagc ctgtcgatgt gaagccaaaa attaaggcct gcattgatga ggttaccaca 3900 

.acactggaag aaactaagtt tcttacca^t aagttactct tgtttgctga. tatcaatggt 3960 
aa'gcttt'acc atgattctca gaacatgctt agaggtgaag atatgtcttt ccttgagaag * 4020' 

' ■ ■ 

gatgcacctt acatggtagg tgatgttatc actagtggtg atatcacttg tgttgtaata - 4080 

ccctccaaaa aggctggtgg cactactgag atgctctcaa gagctttgaa gaaagtgcca 4140 

gttgatgagt atataaccac gtaccctgga .caaggatgtg ctggttatac acttgaggaa 4200 

• s 

I « 

gctaagactg ctcttaagaa atgcaaatct gcattttatg tactaccttc agaagcacct 4260 

aatgctaagg aagagattct aggaactgta tcctggaatt tgagagaaat gcttgctcat 4320 

gctgaagaga caagaaaatt aatgcctata tgcatggatg ttagagccat aatggcaacc. 4380 

atccaacgta agtataaagg. aattadaatt caagagggca tcgttgacta tggtgtccga • 4440 

ttcttctttt atactagtaa agagcctgta gcttctatta ttacgaagct gaactctcta 4500 

aatgagccgc- ttgtcacaat gccaattggt tatgtgacac atggttttaa tcttgaagag 4560 

gctgcgcgct gtatgcgttc tcttaaagct cctgccgtag tgtcagtatc atcaccagat .4620 

gctgttacta catataatgg atacctcact tcgtcatcaa agacatctga ggagcacttt 4680 

gtagaaacag tttctttggc tggctcttac agagattggt cctattcagg acagcgtaca . 4740 

gagttaggtg ttgaatttct taagcgtggt gacaaaattg tgtaccacac tctggagagc 4800 

cccgtcgagt ttcatcttga cggtgaggtt ctttcacttg acaaactaaa gagtctctta 4860 

tccctgcggg aggttaagac tataaaagtg ttcacaactg tggacaacac taatctccac 4920 

acacagcttg tggatatgtc tatgacatat ggacagcagt ttggtccaac atacttggat 4980 



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ggtgctgatg ttacaaaaat taaacctcat gtaaatcatg agggt^agac tttctttgta 5040 

ctacctagtg atgacacact acgtagtgaa gctttcgagt actaccatac tcttgatgag 5100 

agttttcttg. gtaggtacat gtctgcttta aaccacacaa agaaatggaa. atttcctcaa 5160 

gttggtggtt taacttcaat taaatgggct gataacaatt gttatttgtc tagtgtttta • 5220 

I 

ttagcacttc aacagcttga agtcaaattc ^atgcaccag cacttcaaga ggcttattat 5280 

agagcccgtg ctggtgatgc tgctaacttt tgtgcactca tactcgctta cagtaataaa 5340

actgttggcg agcttggtga tgtcagagaa actatgaccc atcttctaca gcatgctaat 5400 

• • • 

ttggaatctg caaagcgagt tcttaatgtg gtgtgtaaac attgtggtca gaaaactact 5460 

acctta'acgg gtgtagaagc tgtgatgtat atgggtactc tatcttatga taatcttaag 5520 

acaggtgttt ccattccatg tgtgtgtggt cgtgatgcta cacaatatct agtacaacaa • 5580 

gagtcttctt. ttgttatgat gtctgcacca cctgctgagt ataaattaca gcaaggtaca , 5640 

ttcttatgtg cgaatgagta cactggtaac tatcagtgtg gtcattacac tcatataact 5700 

gctaaggaga ccctctatcg tattgacgga gctcacctta caaagatgtc agagtacaaa 5760 

ggaccagtga ctgatgtttt ctacaaggaa acatcttaca ctacaaccat caagcotgtg 5820 

tcgtataaac tcgatggagt tacttacaca gagattgaac caaaattgga tgiggtattat 5880 

aaaaaggata atgcttacta tacagagcag cctatagacc ttgtaccaac tcaaccatta 5940 

ccaaatgcga gttttgataa tttcaaactc acatgttcta acacaaaatt tgctgatgat 6000 
• " * - ..... 

ttaaatcaaa tgacaggctt cacaaagcca gcttcacgag agctatctgt cacattcttc 6060 

ccagacttga atggcgatgt agtggctatt .gactatagac actattcagc gagtttcaag 6120 

aaaggtgcta aattactgca"' taagccaatt gtttggcaca ttaaccaggc tacaaccaag 6180 

' acaacgttca aaccaaacac ttggtgttta cgttgtcttt ggagtacaaa gccagtagat 6240 

acttcaaatt • catttgaagt tctggcagta . gaagacacac aaggaatgga eaatcttgct 6300 

tgtgaaagtc aacaacccac ctctgaagaa gtagtggaaa atcctaccat acagaaggaa 6360 

gtcatagagt gtgacgtgaa aactaccgaa gttgtaggca atgtcatact taaaccatca 6420 

9 

• ■ 

gatgaaggtg ttaaagtaac acaagagtta ggtcatgagg atcttatggc tgcttatgtg 6480 

gaaaacacaa gcattaccat taagaaacct aatgagcttt cactagcctt aggtttaaaa 6540 

acaattgcca ctcatggtat tgctgcaatt aatagtgttc cttggagtaa aattttggct 6600 

.tatgtcaa'ac cattcttagg acaagcagca attacaacat caaattgcgc taagagatta 6660 

gcacaacgtg tgtttaacaa ttatatgcct tatgtgttta cattattgtt ccaattgtgt 6720 

acttttacta aaagtaccaa ttctagaatt agagcttcac tacctacaac tattgctaaa 6780 



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aatagtgtta agagtgttgc taaattatgt .ttggatgccg gcattaatta tgtgaagtca 6840 

cccaaatttt ctaaattgtt pacaatcgct atgtggctat tgttgttaag tatttgctta 6900 

ggttctctaa tc'tgtgtaac tgctgctttt ggtgtactct tatctaattt tggtgctcct ''6960 

tcttattgta atggcgttag acfaattgtat cttaattcgt ctaacgttac tactatggat 7020 

ttctgtgaag gttcttttcc ttgcagcatt tgtttaagtg gattagactc ccttgattct 7080 

tatccagctc ttgaaaccat tcaggtgacg atttcatcgt acaagctaga cttgacaatt 7140 

tt.aggtctgg ccgctgagtg ggttttggca tatatgttgt tcacaaaatt cttttattta 7200 

ttaggtcttt cagctataat gcaggtgttc tttggctatt ttgctagtca t.ttcatcagc 7260 

aattcttggc tcatgtggtt tatcattagt attgtacaaa tggcacccgt ttctgcaatg 7320 

gttaggatgt acatcttctt tgcttctttc tacta'catat ggaagagcta tgttcatatc 7380 

atggatggtt gcacctcttc gacttgcatg atgtgctata agcgcaatcg tgccacacgc 7440

gttgagtgta caactattgt taatggcatg aagagatctt tctatgtcta tgcaa'atgga 7500 

« ■ 

ggccgtggct tctgcaagac tcacaattgg aattgtctca attgtgacac attttgcact 7560 

* 

.ggtagtacat tcattagtga tgaagttgpt cgtgatttgt cactccagtt taaaagacca 7620 

atcaacccta ctgaccagtc atcgtatatt gttgatagtg ttgctgtgaa aaatggcgcg • 7680 

ct'tcacctct • actttgacaa ggctggtcaa aagacctatg agagacatcc gctctcccat 7740 

* . ■ ' • • ■ • 

tttgtcaatt tagacaattt gagagctaac aacactaaag gttcactgcc tattaatgtc 7800 

• * 

atagtttttg atggcaagtc caaatgcgac gagtctgctt ctaagtctgc ttctgtgtac • 7860 

tacagtcagc tgatgtgcca' acctattctg ttgcttgacc aagctcttgt atcagacgtt 7920 

• • • 

ggagatagta ctgaagttitc cgttaagatg tttgatgctt atgtcgacac cttttcagca 7980 

4 

acttttagtg ttcctatgga aaaacttaag gcacttgttg ctacagctca cagcgagtta/ 8040 

gcaaagggtg tagctttaga. tggtgtcctt tctacattcg tgtcagctgc ccgacaaggt • 8100 

gttgttgata ccgatgttga cacaaaggat gttattgaat gtctcaaact ttcacatcac 8160 

tctgacttag aagtgacagg tgacagttgt aacaatttca tgctcaccta taata.aggtt 8220 

gaaaacatga cgcccagaga tcttggcgca tgtattgact gtaatgcaag gcatatcaat . 8280 

* 

gcccaagtag caaaaagtca caatgtttca ctcatctgga atgtaaaaga ctacatgtct 8340 

ttatctgaac agctgcgtaa acaaattcgt agtgctgcca agaagaacaa catacctttt . 8400 

agactaactt gtgctacaac tagacaggtt gtcaatgtca taactactaa aatctcactc 8460 

aagggtggta agattgttag - tacttgtttt aaacttatgc ttaaggccac attattgtgc 8520 

gttcttgctg cattggtttg ttatatcgtt atgccagtac atacattgtc aatccatgat 8580 

ggttacacaa atgaaatcat tggttacaaa gccattcagg atggtgtcac tcgtgacatc 8640 
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atttctactg atgattgttt tgcaaataaa catgctggtt- ttgacgcatg gtttagccag 8700 

cgtggtggtt catacaaaaa tgacaaaagc tgccctgtag tagctgctat cattacaaga 8760- 

» ... 

gagattggtt tcatagtgcc tggcttaccg ggtactgtgc tgagagcaat, caatggtgac 8820 

ttcttgcatt ttctacctcg tgtttttagt gctgttggca acatttgcta cacaccttcc 8880 • 

I 

aaactcattgagtatagtga ttttgctacc tctgcttgcg ttcttgctgc tgagtgtaca 8940 

atttttaagg atgctatggg caaacctgtg ccatattgtt atgacact^a tttgctagag 9000 

■ ■ ■ ■ 

ggttctattt cttatagtga gcttcgtcca gacactcgtt atgtgcttat ggatggttcc 9060 

■ 

atcatacagt ttcctaacac ttacctggag ggttctgtta gagtagtaac aacttttgat 9120 

gctgagtact gtagacatgg tacatgcgaa aggtcagaag taggtatttg cctatctacc 9180 

t 

agtggtagat gggttcttaa taatgagcat tacagagctc tatcaggagt tttctgtggt 9240 

gttgatgcga tgaatctcat agctaacatc tttactcctc ttgtgcaacc tgtgggtgct 9300 

ttagatgtgt ctgcttcagt agtggctggt ggtattattg ccatattggt gacttgtgCt 9360 . 

« • 

gcctactact ttatgaaatt cagacgtgtt tttggtgagt acaaccatgt tgttgctgct 9420 

• * 

aatgcacttt. tgtttttgat gtctttcact atactctgtc tggtacpagc ttacagctt't ' 9480 

ctgccgggag tctactcagt cttttacttg tacttgacat tctatttcac caatgatgtt 9540 

tcattcttgg ctcaccttca atggtttgcc atgttttctc ctattgtgcc tttttggata * 9600 
acagcaatct atgtattctg tatttctctg aagcactgcc attggttctt taacaactat-* 9660 

* • • • 

cttaggaaaa gagtcatgtt taatggagtt acatttagta ccttcgagga ggctgctttg 9720 

tgtacctttt tgctcaacaa ggaaatgtac ctaaaattgc gtagcgagac actgttgcca 9780 
• • • . 

cttacacagt ataacaggta tcttgctcta tataacaagt acaagtattt cagtggagcc 9840 * 

ttagatacta ccagctatcg tgaagcagct tgctgccact tagcaaaggc tctaaatgac 9900 

« 

tttagcaact caggtgctga tgttctctac caaccaccac agacatcaat cacttctgct 9960 

gttctgcaga gtggttttag gaaaatggca ttcccgtcag gcaaagttga agggtgcatg 10020 . 

gtacaagtaa cctgtggaac tacaactctt aatggattgt ggttggatga cacagtatac 10080 

tgtccaagac atgtcatttg cacagcagaa gacatgctta atcctaacta tgaagatctg 10140 

ctcattcgpa aatccaacca t^gctttctt gttcaggctg gcaatgttca acttcgtgtt 10200 

attggccatt ctatgcaaaa ttgtctgctt aggcttaaag ttgatacttc taaccctaag 10260 

acacccaagt ataaatttgt ccgtatccaa cctggtcaaa cattttcagt tctagcatgc 10320 

tacaatggtt caccatctgg tgtttatcag tgtgccatga gacctaatca taccattaaa 10380 

ggttctttcc ttaatggatc atgtggtagt gttggtttta acattgatta tgattgcgtg 10440 
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■ • 

- 

tctttctgct atatgcatca tatggagctt ccaacaggag tacacgctgg tactgactta 10500 ' 

gaaggtaaat tctatggtcc atttgttgac agacaaactg cacaggctgc aggtacagac 10560 

• * 

acaaccataa cattaaatgt tttggcatgg ctgtatgctg ctgttatcaa tggtgatagg 10620 

■ * 

tggtttctta ata^^attcac cactactttg aatgacttta accttgtggc aatgaagtac 10680 

aactatgaac^ ctttgacaca agatcatgtt gacatattgg .gacctctttc tgctcaaaca 10740 

ggaattgccg tcttagatat gtgtgctgct ttgaaagagc tgctgcagaa tggtatgaat 10800 

ggtcgtacta tccttggtag cactatttta gaagatgagt .ttacaccatt tgatgttgtt 10860 

agacaatgct ctggtgttac cttccaaggt aagttcaaga aaattgttaa gggcactcat 10920 

cattggatgc ttttaacttt cttgacatca ctattgattc ttgttcaaag tacacagtgg 10980 

tcactgtttt tctttgttta cgagaatgct ttcttgccat ttactcttgg tattatggca 11040 

* 

attgctgcat gtgctatgct gcttgttaag cataagcacg cattcittgtg cttgtttctg .11100 

♦ 

ttaccttctc ttgcaacagt tgcttacttt aatatggtct acatgcctgc tagctgggtg 11160 

atgcgtatca tgacatggct tgaattggct gacactagct tgtctggtta taggcttaag 11220 
gattgtgtta tgtatgcttc agctttagtt ttgcttattc tcatgacagc tcgcactgtt ' 11280 

■ « 

tatgatgatg ctgctagacg tgtttggaca ctgatgaatg tcattacact tgtttacaaa 11340 

gtctactatg gtaatgcttt agjatcaagct atttccatgt gggccttagt tatttctg^a .11400 

• • • . 

acctctaact attctggtgt cgttacgact atcatgtttt tagctagagc tatagtgttt 11460 

gtgtgtgttg agtfettaccc attgttattt attactggca acaccttaca gtgtatcatg 11520 

cttgtttatt gtttcttagg ctattgttgo tgctgctact ttggcctttt ctgtttactc 11580 

aaccgttact tcaggcttac tcttggtgtt tatgactact tggtctctac acaagaattt 11640 

■ 

aggtatatga actcccaggg gcttttgcct cctaagagta gtattgatgc tttcaagctt 11700 

aacattaagt tgttgggtat tggaggtaaa ccatgtatca aggttgctac tgtacagtct 11760. 

aaaatgtctg acgtaaagtg cacatctgtg gtactgctct cggttcttca acaacttaga 11820 

gtagagtcat cttctaaatt gtgggcacaa tgt'gtacaac tccacaatga tattcttctt 11880 

gcaaaagaca caactgaagc tttcgagaag atggtttctc ttttgtctgt tttgctatcc 11940 

atgcagggtg ctgtagacat taataggttg tgcgaggaaa tgctcgataa ccgtgctact . 12000 

cttcaggcta ttgcttcaga atttagttct ttaccatcat atgccgctta tgccactgcc 12060 

caggaggcct atgagcaggc tgtagctaat ggtgattctg aagtcgttct caaaaagtta 12120 

aagaaatctt tgaatgtggc taaatctgag tttgaccgtg atgctgccat gcaacgcaag 12180 

ttggaaaaga tggcagatca ggctatgacc caaatgtaca aacaggcaag atctgaggac 12240 

aagagggcaa aagtaactag- tgctatgcaa acaatgctct tcactatgct taggaagctt 12300 
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gataatgatg cacttaacaa ca'ttatcaac aiatgcgcgtg- atggttgtgt tccactcaac 
atcataccat tgactacagc agccaaactc atggttgttg tccctgatta tggtacctac 
aagaacactt gtgatggtaa cacctttaca .tatgcatctg cactctggga, aatccagcaa 
gttgttgatg cggatagcaa gattgttcaa cttagtgaaa ttaacatgga caattcacca 
aatttggctt ggcctcttat tgttacagct ctaagagcca actcagctgt taaactacag 
aataatgaac tgagtccagt agcacta.cga cagatgt.cct gtgcggctgg taccacacaa 
apagcttgta ctgatgacaa tgcacttgcc tactataaca attcgaa'ggg aggtaggttt 

m ■ 

■ 

gtgctggcat tactatcaga ccaccaagat ctcaaatggg ctagattccc taagagtgat 

ggtacaggta caatttacac agaactggaa ccaccttgta ggtttgttac agacacacca 

aaagggccta aagtgaaata cttgtacttc atcaaaggct taaacaacct aaatagaggt 

. ■ ■ ■ 

atggtgctgg gcagtttagc tgctacagta cgtcttcagg ctggaaatgc tacagaagta 

cctgccaatt caactgtgct ttccttctgt gcttttgcag tagaccctgc taaagcatat 

• * • 

aaggattacc tagcaagtgg aggacaacca atcaccaact .gtgtgaagat gttgtgtaca 
cacactggta caggacaggc aattactgta acaccagaag ctaacatgga ccaagagtcc 

■ 

tttggtggtg cttcatgttg tctgtattgt agatgccaca ttgaccatcc 'aaatcctaaa 
ggattctgtg acttgaaagg taagtacgtc caaataccta ccacttgtgc taatgaccca 
gtgggtt.tta cacttagaaa- cacagtctgt accgtctgcg gaatgtggaa aggttatggc-* 
tgtagttgtg accaactccg cgaacccttg atgcagtctg cggatgcatc aacgttttta 
aacgggtttg cggtgtaagt gcagcccgtc ttacaccgtg cggcacaggc actagtactg 
atgtcgtcta cagggctttt gatatttaca acgaaaaagt tgctggtttt gcaaagttcc • 
taaaaactaa ttgctgtcgc ttccaggaga aggatgagga aggcaattta ttagactctt 
actttgtagt taagaggcat actatgtcta .actaccaaca tgaagagact atttataact 

ft 

tggttaaaga ttgtccagcg gttgctgtcc. atgacttttt caagtttaga gtagatggtg 
acatggtacc acatatatca cgtcagcgtc taactaaata cacaatggct gatttagtct 
atgctctacg tcattttgat gagggtaaitt' gtgatacatt aaaagaaata ctcgtcacat 
acaattgctg tgatgatgat fatttcaata agaaggattg gtatgacttc gtagagaatc 
ctgacatctt acgcgtatat gctaacttag gtgagcgtgt acgccaatca ttattaaaga 
ctgtacaatt ctgcgatgct atgcgtgatg caggcattgt aggcgtactg acattagata 
atcaggatct taatgggaac tggtacgatt tcggtgattt cgtacaagta gcaccaggct 
gcggagttcc tattgtggat tcatattact cattgctgat gcccatcctc actttgacta 
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gggcattggc tgctgagtcc catatggatg ctgatctcgc aaaaccactt attaagtggg 14160 

» • ■ * 

m « • * * 

atttgctgaa atatgatttt a'cggaagaga gactttgtct cttcgaccgt tattttaaat 14220 

attgggacca gacataccat cccaattgta tt^actgttt ggatgatagg tgtatccttc 14280 

attgtgcaaa ctttaatgtg ttattttcta ctgtgtttcc acctacaagt tttggaccac 14340 

"... 

tagtaaga'aa aatatttgta gatggtgttc cttttgttgt ttcaactgga taccattttc 14400 

• • • ' 

gtgagttagg agtcgtacat aatcaggatg taaacttaca tagctcgcgt ctcagtttca 14460 

aggaactttt agtgtatgct gctgatccag ctatgcatgc .agcttctggc aatttattgc 14520 

tagataaacg cactacatgc ttttcagtag ctgcactaac aaacaatgtt gcttttcaaa 14580 

ctgtcaaacc cggtaatttt aataaagact tttatgactt tgctgtgtct aaaggtttct 14640 

ttaaggaagg aagttctgtt gaactaaaac. acttcttctt tgctcaggat ggcaacgctg 14700 

ctatcagtga ttatgactat tatcgttata atctgccaac aatgtgtgat atca'gacaac .14760 

tcctattcgt' agttgaagtt gttgataaat actttgattg ttacgatggt ggctgtatta 14820 

atgccaacca agtaatcgtt aacaatctgg ataaatcagc tggtttccca tttaataaat 14880 

« 

■ 

ggggtaaggc tagactttat tatgactcaa tgagttatga ggatcaagat gcacttttcg' 14940 

cgtatactaa gcgtaatgtc atccctacta taactcaaat gaatcttaag tatgccatta 15000

gt^caaagaa-tagagctcgc accgtagctg gtgtctctat ctgtagtact atgacaaata .15060 

• ■ » • • 

gacagtttca tcagaaatta ttgaagtcaa tagccgccac tagaggagct actgtggtaa 15120 

ttggaacaag caagttttac ggtggctggc ataatatgtt aaaaactgtt tacagtgatg 15180 

I • 

tagaaactcc acaccttatg ggttgggatt atccaaaatg* tgacagagcc atgcctaaca 15240 

• ■ - 

tgcttaggat aatggcctct cttgttcttg ctcgcaaaca taacacttgc tgtaacttat 15300 

« * 

cacaccgttt ctacaggtta gctaacgagt gtgcgcaagt attaagtgag atggtcatgt 15360 

gtggcggctc actatatgtt aaaccaggtg gaacatcatc cggtgatgct acaactgctt 15420 

atgctaatag tgtctttaac atttgtcaag ctgttacagc caatgtaaat gcacttcttt 15480 

caactgatgg taataagata gctgacaagt atgtccgcaa tctacaacac aggctctatg 15540 
agtgtctcta tagaaat.agg gatgttgatc atgaattcgt ggatgagttt tacgcttacc* 15600 

tgcgtaaaca tttctccatg atgattcttt ctgatgatgc cgttgtgtgc tataacagta 15660 

actatgcggc tcaaggttta gtagctagca ttaagaactt taaggcagtt ctttattatc . 15720 

aaaataatgt gttcatgtct gaggcaaaat gttggactga gactgacctt actaaaggac 15780 

ctcacgaatt ttgctcacag catacaatgc tagttaaaca aggagatgat tacgtgtacc 15840 

tgccttaccc agatccatca agaatattag gcgcaggctg ttttgtcgat gatattgtca 15900 

aaacagatgg tacacttatg attgaaaggt tcgtgtcact ggctattgat gcttacccac 15960 
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ttacaaaaca tcctaatcag. gagtatgctg atgtctttca cttgtattta caatacatta 16020 

gaaagttaca tgatgagctt actggccaca tgttggacat gtattccgta atgctaaeta 16080 

* ■ • 

atgataacac ctcacggtac tgggaacctg agttttatga ^ggctatgtac. acaccacata 16140' 

cagtc'ttgca ggctgtaggt gtttgtgtat tgtgcaattc acagacttca cttcgttgcg 16200- 

gtgcctgtat taggagacca ttcctatgtt gcaagtgctg ctatgaccat gtcatttcaa 16260 

catcacacaa attagtgttg tctgtta.atc cctatgt.ttg caatgcccca ggttgtgatg 16320 

tpactgatgt gacacaactg tatctaggag gtatgagcta ttattgcaag tcacataagc 16380 

ctcccattag ttttccatta tgtgctaatg gtcaggtttt tggtttatac aaaaacacat 16440 

gtgtaggcag tgacaatgtc actgacttca atgcgatagc aacatgtgat tggactaatg 16500 

ctggcgatta catacttgcc aacacttgta ctgagagact caagcttttc. gcagcagaaa 16560 

cgctcaaagc cactgaggaa acatttaagc tgtcatatgg tattgccact gtacgcgaag 16620 

tactctctga cagagaattg catctttcat gggaggttgg aaaacctaga ccaccattga 16680 . 

» 

acagaaacta tgtctttact ggttaccgtg taactaaaaa tagtaaagta cagattggag. 16740 

agtacacctt tgaaaaaggt gactatggtg atgctgttgt gtacagaggt actacgacat 16800 

• « » 

acaagttgaa tgttggtgat tactttgtgt tgacatctca cactgtaatg ccacttagtg 16860 

cacctactcti agtgccacaa gagcactatg tgagaattac tggcttgtac ccaacactca ' 16920 

ft 

acatctcaga tgagttttct .agcaatgttg caaattat;ca aaaggtcggc atgcaaaagt ■ 16980 

actctacact ccaaggacca cctggtactg gtaagagtca ttttgccatc ggacttgctc 17040

tctattaccc atctgctcgc atagtgtata cggcatgctc tcatgcagct gttgatgccc 17100 

tatgtgaaaa ggcattaaaa tatttgccca tagataaatg tagtagaatc atacctgcgc 17160 

■ 

gtgcgcgcgt agagtgtttt gataaattca aagtgaattc aacactagaa cagtatgttt 17220 

tctgcactgt aaatgcattg ccagaaacaa ctgctgacat tgtagtcttt gatgaaatct 17280 

ctatggctac taattatgac ttgagtgttg tcaatgctag acttcgtgca aaacactacg 17340

tctatattgg cgatcctgct caattaccag ccccccgcac attgctgact aaaggcacac 17400

tagaaccaga atattttaat tcagtgtgca gacttatgaa aacaataggt ccagacatgt 17460

tccttggaac ttgtcgccgt tgtcctgctg aaattgttga cactgtgagt gctttagttt 17520 

atgacaataa gctaaaagca cacaaggata agtcagctca atgcttcaaa atgttctaca 17580 

aaggtgttat tacacatgat gtttcatctg caatcaacag acctcaaata ggcgttgtaa 17640 

gagaatttct tacacgcaat cctgcttgga gaaaagctgt ttttatctca ccttataatt 17700

cacagaacgc tgtagcttca aaaatcttag gattgcctac gcagactgtt gattcatcac 17760

agggttctga atatgactat gtcatattca cacaaactac tgaaacagca cactcttgta 17820

atgtcaaccg cttcaatgtg gctatcacaa gggcaaaaat tggcattttg tgcataatgt 17880

ctgatagaga tctttatgac aaactgcaat ttacaagtct agaaatacca cgtcgcaatg 17940 

tggctacatt acaa^qcagaa aatgtaactg gactttttaa ggactgtagt aagatcatta 18000 

ctggtcttca tcctacacag gcacctacac acctcagcgt tgatataaag ttcaagactg 18060

aaggattatg tgttgacata ccaggcatac caaaggacat gacctaccgt agactcatct 18120 

ctatgatggg tttcaaaatg aattaccaag tcaatggtta ccctaatatg tttatcaccc 18180

gcgaagaagc tattcgtcac gttcgtgcgt ggattggctt tgatgtagag ggctgtcatg 18240 

caactagaga tgctgtgggt actaacctac ctctccagct aggattttct acaggtgtta 18300 

acttagtagc tgtaccgact ggttatgttg acactgaaaa taacacagaa ttcaccagag 18360

ttaatgcaaa acctccacca ggtgaccagt ttaaacatct tataiicactc atgtataaag 18420 

gcttgccctg gaatgtagtg cgtattaaga tagtacaaat gctcagtgat acactgaaag 18480 

gattgtcaga cagagtcgtg ttcgtccttt gggcgcatgg ctttgagctt acatcaatga 18540

agtactttgt caagattgga cctgaaagaa cgtgttgtct gtgtgacaaa cgtgcaactt 18600

gcttttctac ttcatcagat acttatgcct gctggaatca ttctgtgggt tttgactatg 18660

tctataaccc atttatgatt gatgttcagc agtggggctt tacgggtaac cttcagagta 18720

accatgacca acattgccag gtacatggaa atgcacatgt ggct^gttgt gatgctatca 18780 

tgactagatg tttfegcagtc catgagtgct ttgttaagcg cgttgattgg tctgttgaat 18840 

accctattat aggagatgaa ctgagggtta attctgcttg cagaaaagta caacacatgg 18900 

ttgtgaagtc tgcattgctt gctgataagt ttccagttct tcatgacatt ggaaatccaa 18960 

aggctatcaa gtgtgtgcct caggctgaag tagaatggaa gttctacgat gctcagccat 19020

gtagtgacaa agpttacaaa atagaggaac tcttctattc ttatgctaca catcacgata 19080 

aattcactga tggtgtttgt ttgttttgga attgtaacgt tgatcgttac ccagccaatg 19140 

caattgtgtg taggtttgac acaagagtct tgtcaaactt gaacttacca ggctgtgatg 19200

gtggtagttt gtatgtgaat aagcatgcat tccacactcc agctttcgat aaaagtgcat 19260 

ttactaattt aaagcaattg cctttctttt actattctga tagtccttgt gagtctcatg 19320

gcaaacaagt agtgtcggat attgattatg ttccactcaa atctgctacg tgtattacac 19380 

gatgcaattt aggtggtgct gtttgcagac accatgcaaa tgagtaccga cagtacttgg 19440 

atgcatataa tatgatgatt tctgctggat ttagcctatg gatttacaaa caatttgata 19500 

cttataacct gtggaataca tttaccaggt tacagagttt agaaaatgtg gcttataatg 19560 

ttgttaataa aggacacttt gatggacacg ccggcgaagc acctgtttcc atcattaata 19620

atgctgttta cacaaaggta gatggtattg atgtggagat ctttg^aaat aagacaacac 19680

ttcctgttaa tgttgcattt gagctttggg ctaagcgtaa cattaaacca gtgccagaga 19740

ttaagatact caataatttg ggtgttgata tcgctgctaa tactgtaatc tgggactaca 19800

aaagagaagc cccaigcacat gtatctacaa taggtgtctg cacaatgact tfacattgcca 19860

agaaacctac tgagagtgct tgttcttcac ttactgtctt gtttgatggt agagtggaag 19920 

gacaggtaga cctttttaga aacgcccgta atggtgtttt aataacagaa ggttcagtca 19980 

aaggtctaac accttcaaag ggaccagcac aagctagcgt caatggagtc acattaattg 20040 

gagaatcagt aaaaacacag tttaactact ttaagaaagt agacggcatt attcaacagt 20100

tgcctgaaac ctactttact cagagcagag acttagagga ttttaagccc agatcacaaa 20160 

tggaaactga ctttctcgag ctcgctatgg atgaattcat acagcgatat aagctcgagg 20220 

gctatgcctt cgaacacatc gtttatggag atttcagtca tggacaactt ggcggtcttc 20280

atttaatgat aggcttagcc aagcgctcac aagattcacc acttaaatta gaggatttta 20340

tccctatgga cagcacagtg aaaaattact tcataacaga tgcgcaaaca ggttcatcaa 20400 

aatgtgtgtg ttctgtgatt gatcttttac ttgatgactt tgtcgagata ataaagtcac 20460 

aagatttgtc agtgatttca aaagtggtca aggttacaat tgactatgct gaaatttcat 20520 

tcatgctttg gtgtaaggat ggacatgttg aaaccttcta cccaaaacta caagcfeiagtc 20580 

gagcgtggca accaggtgtt gcgatgccta acttgtacaa gatgcaaaga atgcttcttg 20640 

aaaagtgtga ccttcagaat tatggtgaaa atgctgttat accaaaagga ataatgatga 20700 

atgtcgcaaa gtatactcaa ctgtgtcaat acttaaatac acttacttta gctgtaccct 20760

acaacatgag agttattcac tttggtgctg gctctgataa aggagttgca ccaggtacag 20820

ctgtgctcag acaatggttg ccaactggca cactacttgt cgattcagat cttaatgact 20880

tcgtctccga cgcatattct actttaattg gagactgtgc aacagtacat acggctaata 20940

aatgggacct tattattagc gatatgtatg accctaggac caaacatgtg acaaaagaga 21000

atgactctaa agaagggttt ttcacttatc tgtgtggatt tataaagcaa aaactagccc 21060

tgggtggttc tatagctgta aagataabag agcattcttg gaatgctgac ctttacaagc 21120

ttatgggcca tttctcatgg tggacagctt ttgttacaaa tgtaaatgca tcatcatcgg 21180

aagcattttt aattggggct aactatcttg gcaagccgaa ggaacaaatt gatggctata 21240

ccatgcatgc taactacatt ttctggagga acacaaatcc tatccagttg tcttcctatt 21300

cactctttga catgagcaaa tttcctctta aattaagagg aactgctgta atgtctctta 21360

aggagaatca aatcaatgat atgatttatt ctcttctgga aaaaggtagg cttatcatta 21420

g^gaaaacaa cagagttgtg gtttcaagtg atattcttgt taacaactaa acgaacatgt 21480

ttattttctt attatttctt abtctcacta gtggtagtga ccttgaccgg tgcaccactt 21540 

ttgatgatgt tcaagctcct aattacactc aacatacttc atctatgagg ggggtttact 21600 

atcctgatga aatttttaga tcagacactc tttatttaac tcaggattta tttcttccat 21660

tttattctaa tgttacaggg tttcatacta ttaatcatac gtttggcaac cctgtcatac 21720 

cttttaagga tggtatttat tttgctgcca cagagaaatc aaatgttgtc cgtggttggg 21780 

tttttggttc taccatgaac aacaagtcac agtcggtgat tattattaac aattctacta 21840

atgttgttat acgagcatgt aactttgaat tgtgtgacaa ccctttcttt gctgtttcta 21900

aacccatggg tacacagaca catactatga tattcgataa tgcatttaat tgcactttcg 21960 

agtacatatc tgatgccttt tcgcttgatg tttcagaaaa gtcaggtaat tttaaacact 22020

tacgagagtt tgtgtttaaa aataaagatg ggtttctcta tgtttataag ggctatcaac 22080 

ctatagatgt agttcgtgat ctaccttctg gttttaacac tttgaaacct atttttaagt 22140 

tgcctcttgg tattaacatt acaaatttta gagccattct tacagccttt tcacctgctc 22200 

aagacatttg gggcacgtca gctgcagcct attttgttgg ctatttaaag ccaactacat 22260

ttatgctcaa gtatgatgaa aatggtacaa tcacagatgc tgttgattgt tctcaaaatc 22320

cacttgctga actcaaatgc tctgttaaga gctttgagat tgacaaagga atttaccaga 22380 

cctctaattt cagggttgtt ccctcaggag atgttgtgag attccctaat attacaaact 22440 

tgtgtccttt tggagaggtt tttaatgcta ctaaattccc ttctgtctat gcatgggaga 22500

gaaaaaaaat ttctaattgt gttgctgatt actctgtgct ctacaactca acattttttt 22560

caacctttaa gtgctatggc gtttctgcca ctaagttgaa tgatctttgc ttctccaatg 22620 

tctatgcaga ttcttttgta gtcaagggag atgatgtaag acaaatagcg ccaggacaaa 22680

ctggtgttat tgctgattat aattataaat tgccagatga tttcatgggt tgtgtccttg 22740

cttggaatac taggaacatt gatgctactt caactggtaa ttataattat aaatataggt 22800 

atcttagaca tggcaagctt aggccctttg agagagacat atctaatgtg cctttctccc 22860 

ctgatggcaa accttgcacc ccacctgctc ttaattgtta ttggccatta aatgattatg 22920

gtttttacac cactactggc attggctacc aaccttacag agttgtagta ctttcttttg 22980 

aacttttcaa tgcaccggcc acggtttgtg gaccaaaatt atccactgac cttattaaga 23040

accagtgtgt caattttaat tttaatggac tcactggtac tggtgtgtta actccttctt 23100 

caaagagatt tcaaccattt caacaatttg gccgtgatgt ttctgatttc actgattccg 23160 

ttcgagatcc taaaacatct gaaatattag acatttcacc ttgcgctttt gggggtgtaa 23220

gtgtaattac acctggaaca aatgcttcat ctgaagttgc tgttctatat caagatgtta 23280 

actgcactga tgtttctaca gcaattcatg cagatcaact cacaccagct tggcgcatat 23340

attctactgg aaacaatgta ttccagactc aagcaggctg tcttatagga gctgagcatg 23400

tcSgacacttc ttatgagtgc gacattccta ttggagctgg catttgtgct agttaccata 23460

cagtttcttt attacgtagt actagccaaa aatctattgt ggcttatact atgtctttag 23520

gtgctgatag ttcaattgct tactctaata acaccattgc tatacctact aacttttcaa 23580

ttagcattac tacagaagta atgcctgttt ctatggctaa aacctccgta gattgtaata 23640 

tgtacatctg cggagattct actgaatgtg ctaatttgct tctccaatat ggtagctttt 23700 

gcacacaact aaatcgtgca ctctcaggta ttgctgctga acaggatpgc aacacacgtg 23760

aagtgttcgc tcaagtcaaa caaatgtaca aaaccccaac tttgaaatat tttggtggtt 23820 

ttaatttttc acaaatatta cctgaccctc taaagccaac taagaggtct tttattgagg 23880 

acttgctctt taataaggtg acactcgctg atgctggctt catgaagcaa tatggcgaat 23940 

gcctaggtga tattaatgct agagatctca tttgtgcgca gaagttcaat ggacttacag 24000 

tgttgccacc tctgctcact gatgatatga ttgctgccta cactgctgct ctagttagtg 24060 

gtactgccac tgctggatgg acatttggtg ctggcgctgc tcttcaaata ccttttgcta 24120 

tgcaaatggc atataggttc aatggcattg gagttaccca aaatgttctc tatgagaacc 24180 

aaaaacaaat cgccaaccaa tttaacaagg cgattagtca aattcaagaa tcacttacaa 24240 

caacatcaac tgcattgggc aagctgcaag acgttgttaa ccagaatgct caagcattaa 24300 

acacacttgt taaacaactt agctctaatt ttggtgcaat ttcaagtgtg ctaaatgata 24360 

tcctttcgcg acttgataaa gtcgaggcgg aggtacaaat tgacaggtta attacaggca 24420 

gacttcaaag ccttcaaacc tatgtaacac aacaactaat cagggctgct gaaatcaggg 24480 

cttctgctaa tcttgctgct actaaaatgt ctgagtgtgt tcttggacaa tcaaaaagag 24540 

ttgacttttg tggaaagggc taccacctta tgtccttccc acaagcagcc ccgcatggtg 24600

ttgtcttcct acatgtcacg tatgtgccat cccaggagag gaacttcacc acagcgccag 24660

caatttgtca tgaaggcaaa gcatacttcc ctcgtgaagg tgtttttgtg tttaatggca 24720 

cttcttggtt tattacacag aggaacttct tttctccaca aataattact acagacaata 24780 

catttgtctc aggaaattgt gatgtcgtta ttggcatcat taacaacaca gtttatgatc 24840 

ctctgcaacc tgagcttgac tcattcaaag aagagctgga caagtacttc aaaaatcata 24900 

catcaccaga tgttgatctt ggcgacattt caggcattaa cgcttctgtc gtcaacattc 24960 

acaaagaaat tgaccgcctc aatgaggtcg ctaaaaattt aaatgaatca ctcattgacc 25020

ttcaagaatt gggaaaatat gagcaatata ttaaatggcc ttggtatgtt tggctcggct 25080

tcattgctgg actaattgcc atcgtcatgg ttacaatctt gctttgttgc atgactagtt 25140

gttgcagttg cctcaagggt gcatgctctt gtggttcttg ctgcaagttt gatgaggatg 25200

actctgagcc agttctcaag ggtgtcaaat tacattacac ataaacgaac ttatggattt 25260

gtttatgaga ttttttactc ttggatcaat tactgcacag ccagtaaaaa ttgacaatgc 25320

ttctcctgca agtactgttc atgctacagc aacgataccg ctacaagcct cactcccttt 25380

cggatggctt gttattggcg ttgcatttct tgctgttttt cagagcgcta ccaaaataat 25440

tgcgctcaat aaaagatggc agctagccct ttataagggc ttccagttca tttgcaattt 25500 

actgctgcta tttgttacca tctattcaca tcttttgctt gtcgctgcag gtatggaggc 25560

gcaatttttg tacctctatg ccttgatata ttttctacaa tgcatcaacg catgtagaat 25620 

tattatgaga tgttggcttt gttggaagtg caaatccaag aacccattac tttatgatgc 25680 

caactacttt gtttgctggc acacacataa ctatgactac tgtataccat atasicagtgt 25740

cacagataca att^tcgtta ctgaaggtga cggcatttca acaccaaaac tcaaagaaga 25800 

ctaccaaatt ggtggttatt ctgaggatag gcactcaggt gttaaagact atgtcgttgt 25860 

acatggctat ttcaccgaag tttactacca gcttgagtct acacaaatta ctacagacac 25920 

tggtattgaa aatgctacat tcttcatctt taacaagctt gttaaagacc caccgaatgt 25980

gcaaatacac acaatcgacg gctcttcagg agttgctaat ccagcaatgg atccaattta 26040

tgatgagccg acgacgacta ctagcgtgcc tttgtaagca caagaaagtg agtacgaact 26100 

tatgtactca ttcgtttcgg aagaaacagg tacgttaata gttaatagcg tacttctttt 26160

tcttgctttc gtggtattct tgctagtcac actagccatc cttactgcgc ttcgattgtg 26220

tgcgtactgc tgcaatattg ttaacgtgag tttagtaaaa ccaacggttt acgtctactc 26280

gcgtgttaaa aatctgaact cttctgaagg agttcctgat cttctggtct aaacgaacta 26340 

actattatta ttattctgtt tggaacttta acattgctta tcatggcaga caacggtact 26400

attaccgttg aggagcttaa acaactcctg gaacaatgga acctagtaat aggtttccta 26460 

ttcctagcct ggattatgtt actacaattt gcctattcta atcggaacag gtttttgtac 26520 

ataataaagc ttgttttcct ctggctcttg tggccagtaa cacttgcttg ttttgtgctt 26580

gctgctgtct acagaattaa ttgggtgact ggcgggattg cgattgcaat ggcttgtatt 26640 

gtaggcttga tgtggcttag ctacttcgtt gcttccttca ggctgtttgc tcgtacccgc 26700

tcaatgtggt cattcaaccc agaaacaaac attcttctca atgtgcctct ccgggggaca 26760 

attgtgacca gaccgctcat ggaaagtgaa cttgtcattg gtgctgtgat cattcgtggt 26820 

cacttgcgaa tggccggaca ctccctaggg cgctgtgaca ttaaggacct gccaaaagag 26880 

atcactgtgg ctacatcacg aacgctttct tattacaaat taggagcgtc gcagcgtgta 26940

ggcactgatt caggttttgc tgcatacaac cgctaccgta ttggaaacta taaattaaat 27000 

acagaccacg ccggtagcaa cgacaatatt gctttgctag tacagtaagt gacaacagat 27060

gtttcatctt ^ttgacttcc aggttacaat agcagagata ttgattatca ttatgaggac 27120

tttcaggatt gctatttgga atcttgacgt tataataagt tcaatagtga gacaattatt 27180 

taagcctcta actaagaaga attattcgga gttagatgat gaagaaccta tggagttaga 27240 

ttatccataa aacgaacatg aaaattattc tcttcctgac attgattgta tttacatctt 27300 

gcgagctata tcactatcag gagtgtgtta gaggtacgac tgtactacta aaagaacctt 27360 

gcccatcagg aacatacgag ggcaattcac catttcaccc tcttgctgac aataaatttg 27420 

cactaacttg cactagcaca cactttgctt ttgcttgtgc tgacggtact cgacatacct 27480 

atcagctgcg tgcaagatca gbttcaccaa aacttttcat cagacaagag gaggttcaac 27540 

aagagctcta ctcgccactt tttctcattg ttgctgctct agtattttta atactttgct 27600 

tcaccattaa gagaaagaca gaatgaatga gctcacttta attgacttct atttgtgctt 27660 

tttagccttt ctgctattcc ttgttttaat aatgcttatt atattttggt tttcactcga 27720 

aatccaggat ctagaagaac cttgtaccaa agtctaaacg aacatgaaac ttctcattgt 27780

tttgacttgt atttctctat gcagttgcat atgcactgta gtacagcgct gtgcatctaa 27840

taaacctcat gtgcttgaag atccttgtaa ggtacaacac taggggtaat acttatagca 27900 

ctgcttggct ttgtgctcta ggaaaggttt taccttttca tagatggcac actatggttc 27960 

aaacatgcac acctaatgtt actatcaact gtcaagatcc agctggtggt gcgcttatag 28020 

ctaggtgttg gtaccttcat gaaggtcacc aaactgctgc atttagagac gtacttgttg 28080

ttttaaataa acgaacaaat taaaatgtct gataatggac cccaatcaaa ccaacgtagt 28140

gccccccgca ttacatttgg tggacccaca gattcaactg acaataacca gaatggagga 28200

cgcaatgggg caaggccaaa acagcgccga ccccaaggtt tacccaataa tactgcgtct 28260

tggttcacag ctctcactca gcatggcaag gaggaactta gattccctcg aggccagggc 28320 

gttccaatca acaccaatag tggtccagat gaccaaattg gctactaccg aagagctacc 28380 

cgacgagttc gtggtggtga cggcaaaatg aaagagctca gccccagatg gtacttctat 28440 

tacctaggaa ctggcccaga agcttcactt ccctacggcg ctaacaaaga aggcatcgta 28500

tgggttgcaa ctgagggagc cttgaataca cccaaagacc acattggcac ccgcaatcct 28560

aataacaatg ctgccaccgt gctacaactt cctcaaggaa caacattgcc aaaaggcttc 28620

tacgcagagg gaagcagagg cggcagtcaa gcctcttctc gctcctcatc acgtagtcgc 28680 

ggtaattcaa gaaattcaac tcctggcagc agtaggggaa attctcctgc tcgaatggct 28740 

agcggaggtg gtgaaactgc cctcgcgcta ttgctgctag acagattgaa ccagcttgag 28800

agcaaagttt ctggtaaagg ccaacaacaa caaggccaaa ctgtcactaa gaaatctgct 28860

gctgaggcat ctaaaaagcc tcgccaaaaa cgtactgcca caaaacagta caacgtcact 28920

caagcatttg ggagacgtgg tccagaacaa acccaaggaa atttcgggga ccaagaccta 28980

atcagacaag gaactgatta caaacattgg ccgcaaattg cacaatttgc tccaagtgcc 29040

tctgcattct ttggaatgtc acgcattggc atggaagtca caccttcggg aacatggctg 29100

acttatcatg gagccattaa attggatgac aaagatccac aattcaaaga caacgtcata 29160

ctgctgaaca agcacattga cgcatacaaa acattcccac caacagagcc taaaaaggac 29220 

aaaaagaaaa agactgatga agctcagcct ttgccgcaga gacaaaagaa gcagcccact 29280

gtgactcttc ttcctgcggc tgacatggat gatttctcca gacaacttca aaattccatg 29340

agtggagctt ctgctgattc aactcaggca taaacactca tgatgaccac acaaggcaga 29400

tgggctatgt aaacgttttc gcaattccgt ttacgataca tagtctactc ttgtgcagaa 29460

tgaattctcg taactaaaca gcacaagtag gtttagttaa ctttaatctc acatagcaat 29520

ctttaatcaa tgtgtaacat tagggaggac ttgaaagagc caccacattt tcatcgaggc 29580 

cacgcggagt acgatcgagg gtacagtgaa taatgctagg gagagctgcc tatatggaag 29640

agccctaatg tgtaaaatta attttagtag tgctatcccc atgtgatttt aatagcttct 29700 

taggagaatg acaaaaaaaa aaaaaaaaaa aaaaaa 29736 

<210> 2
<211> 29736
<212> DNA
<213> Severe acute respiratory syndrome virus

<400> 2

ctacccagga aaagccaacc aacctcgatc tcttgtagat ctgttctcta aacgaacttt 60

aaaatctgtg tagctgtcgc tcggctgcat gcctagtgca cctacgcagt ataaacaata 120

ataaatttta ctgtcgttga caagaaacga gtaactcgtc cctcttctgc agactgctta 180

cggtttcgtc cgtgttgcag tcgatcatca gcatacctag gtttcgtccg ggtgtgaccg 240

aaaggtaaga tggagagcct tgttcttggt gtcaacgaga aaacacacgt ccaactcagt 300

ttgcctgtcc ttcaggttag agacgtgcta gtgcgtggct tcggggactc tgtggaagag 360

gccctatcgg aggcacgtga acacctcaaa aatggcactt gtggtctagt agagctggaa 420

aaaggcgtac tgccccagct tgaacagccc tatgtgttca ttaaacgttc tgatgcctta 480

agcaccaatc acggccacaa ggtcgttgag ctggttgcag aaatggacgg cattcagtac 540

ggtcgtagcg gtataacact gggagtactc gtgccacatg tgggcgaaac cccaattgca 600

taccgcaatg ttcttcttcg taagaacggt aataagggag ccggtggtca tagctatggc 660

atcgatctaa agtcttatga cttaggtgac gagcttggca ctgatcccat tgaagattat 720

gaacaaaact ggaacactaa gcatggcagt ggtgcactcc gtgaactcac tcgtgagctc 780

aatggaggtg c^gtcactcg ctatgtcgac aacaatttct gtggcccaga tgggtaccct 840

cttgattgca tcaaagattt tctcgcacgc gcgggcaagt caatgtgcac f ctttccgaa 900 

caacttgatt acatcgagtc gaagagaggt gtctactgct gccgtgacca tgagcatgaa 960

attgcctggt tcactgagcg ctctgataag agctacgagc accagacacts cttcgaaatt 1020 

aagagtgcca agaaatttga cactttcaaa ggggaatgcc caaagtttgt gtttcctctt 1080

aactcaaaag tcaaagtcat tcaaccacgt gttgaaaaga aaaagactga gggtttcatg 1140 

gggcgtatac gctctgtgta ccctgttgca tctccacagg agtgtaacaa tatgcacttg 1200 

tctaccttga tgaaatgtaa tcattgcgat gaagtttcat ggcagacgtg cgactttctg 1260

aaagccactt gtgaacattg tggcactgaa aatttagtta ttgaaggacc tactacatgt 1320 

gggtacctac ctactaatgc tgtagtgaaa atgccatgtc ctgcctgtca agacccagag 1380

attggacctg agcatagtgt tgcagattat cacaaccact caaacattga aactcgactc 1440 

cgcaagggag gtaggactag atgttttgga ggctgtgtgt ttgcctatgt tggctgctat 1500

aataagcgtg cctactgggt tcctcgtgct agtgctgata ttggctcagg ccatactggc 1560 

attactggtg acaatgtgga gaccttgaat gaggatctcc ttgagatact gagtcgtgaa 1620 

cgtgttaaca ttaacattgt tggcgatttt catttgaatg aagaggttgc catcattttg 1680

gcatctttct ctgcttctac aagtgccttt attgacacta taaagagtct tgattacaag 1740

tctttcaaaa ccattgttga gtcctgcggt aactataaag ttaccaaggg aaagcccgta 1800

aaaggtgctt ggaacattgg acaacagaga tcagttttaa caccactgtg tggttttccc 1860

tcacaggctg ctggtgttat cagatcaatt tttgcgcgca cacttgatgc agcaaaccac 1920

tcaattcctg atttgcaaag agcagctgtc accatacttg atggtatttc tgaacagtca 1980 

ttacgtcttg tcgacgccat ggtttatact tcagacctgc tcaccaacag tgtcattatt 2040 

atggcatatg taactggtgg tcttgtacaa cagacttctc agtggttgtc taatcttttg 2100

ggcactactg ttgaaaaact caggcctatc tttgaatgga ttgaggcgaa acttagtgca 2160 

ggagttgaat ttctcaagga tgcttgggag attctcaaat ttctcattac aggtgttttt 2220

gacatcgtca agggtcaaat acaggttgct tcagataaca tcaaggattg tgtaaaatgc 2280 

ttcattgatg ttgttaacaa ggcactcgaa atgtgcattg atcaagtcac tatcgctggc 2340 

gcaaagttgc gatcactcaa cttaggtgaa gtcttcatcg ctcaaagcaa gggactttac 2400 

cgtcagtgta tacgtggcaa ggagcagctg caactactca tgcctcttaa ggcaccaaaa 2460

gaagtaacct ttcttgaagg tgattcacat gacacagtac ttacctctga ggaggttgtt 2520

ctcaagaacg gtgaactcga agcactcgag acgcccgttg atagcttcac aaatggagct 2580 

atcgttggca caccagtctg tgtaaatggc ctcatgctct tagagattaa ggacaaagaa 2640 

caatactgcg cattgtctcc tggtttactg gctacaaaca atgtctttcg cttaaaaggg 2700 

ggtgcaccaa ttaaaggtgt aacctttgga gaagatactg tttgggaagt tcaaggttac 2760 

aagaatgtga gaatcacatt tgagcttgat gaacgtgttg acaaagtgct taatgaaaag 2820 

tgctctgtct acactgttga atccggtacc gaagttactg agtttgcatg tgttgtagca 2880 

gaggctgttg tgaagacttt acaaccagtt tctgatctcc ttaccaacat gggtattgat 2940 

cttgatgagt ggagtgtagc tacattctac ttatttgatg atgctggtga agaaaacttt 3000 
tcatcacgta tgtattgttc cttttaccct ccagatgagg aagaagagga cgatgcagag 3060

tgtgaggaag aagaaattga tgaaacctgt gaacatgagt acggtacaga ggatgattat 3120

caaggtctcc ctctggaatt tggtgcctca gctgaaacag ttcgagttga ggaagaagaa 3180 

gaggaagact ggctggatga tactactgag caatcagaga ttgagccaga accagaacct 3240 

acacctgaag aaccagttaa tcagtttact ggttatttaa aacttactga caatgttgcc 3300 

attaaatgtg ttgacatcgt taaggaggca caaagtgcta atcctatggt gattgtaaat 3360

gctgctaaca tacacctgaa acatggtggt ggtgtagcag gtgcactcaa caaggcaacc 3420 

aatggtgcca tgcaaaagga gagtgatgat tacattaagc taaatggccc tcttacagta 3480 

ggagggtctt gtttgctttc tggacataat cttgctaaga agtgtctgca tgttgttgga 3540

cctaacctaa atgcaggtga ggacatccag cttcttaagg cagcatatga aaatttcaat 3600

tcacaggaca tcttacttgc accattgttg tcagcaggca tatttggtgc taaaccactt 3660 

cagtctttac aagtgtgcgt gcagacggtt cgtacacagg tttatattgc agtcaatgac 3720 

aaagctcttt atgagcaggt tgtcatggat tatcttgata acctgaagcc tagagtggaa 3780

gcacctaaac aagaggagcc accaaacaca gaagattcca aaactgagga gaaatctgtc 3840 

gtacagaagc ctgtcgatgt gaagccaaaa attaaggcct gcattgatga ggttaccaca 3900 

acactggaag aaactaagtt tcttaccaat aagttactct tgtttgctga tatcaatggt 3960 

aagctttacc atgattctca gaacatgctt agaggtgaag atatgtcttt ccttgagaag 4020 

gatgcacctt acatggtagg tgatgttatc actagtggtg atatcacttg tgttgtaata 4080 

ccctccaaaa aggctggtgg cactactgag atgctctcaa gagctttgaa gaaagtgcca 4140 

gttgatgagt atataaccac gtaccctgga caaggatgtg ctggttatac acttgaggaa 4200 

gctaagactg ctcttaagaa atgcaaatct gcattttatg tactaccttc agaagcacct 4260 

aatgctaagg aagagattct aggaactgta tcctggaatt tgagagaaat gcttgctcat 4320

gctgaagaga caagaaaatt aatgcctata tgcatggatg ttagagccat aatggcaacc 4380

atccaacgta agtataaagg aattaaaatt caagagggca tcgttgacta tggtgtccga 4440

ttcttctttt atactagtaa agagcctgta gcttctatta ttacgaagct gaactctcta 4500

aatgagccgc ttgtcacaat gccaattggt tatgtgacac atggttttaa ^cttgaagag 4560 

gctgcgcgct gtatgcgttc tcttaaagct cctgccgtag tgtcagtatc atcaccagat 4620 

gctgttacta catataatgg atacctcact tcgtcatcaa agacatctga ggagcacttt 4680

gtagaaacag tttctttggc tggctcttac agagattggt cctattcagg acagcgtaca 4740

gagttaggtg ttgaatttct taagcgtggt gacaaaattg tgtaccacac tctggagagc 4800

cccgtcgagt ttcatcttga cggtgaggtt ctttcacttg acaaactaaa gagtctctta 4860

tccctgcggg aggttaagac tataaaagtg ttcacaactg tggacaacac taatctccac 4920

acacagcttg tggatatgtc tatgacatat ggacagcagt ttggtccaac atacttggat 4980 

ggtgctgatg ttacaaaaat taaacctcat gtaaatcatg agggtaagac tttctttgta 5040

ctacctagtg atgacacact acgtagtgaa gctttcgagt actaccatac tcttgatgag 5100

agttttcttg gtaggtacat gtctgcttta aaccacacaa agaaatggaa atttcctcaa 5160

gttggtggtt taacttcaat taaatgggct gataacaatt gttatttgtc tagtgtttta 5220 

ttagcacttc aacagcttga agtcaaattc aatgcaccag cacttcaaga ggcttattat 5280

agagcccgtg ctggtgatgc tgctaacttt tgtgcactca tactcgctta cagtaataaa 5340

actgttggcg agcttggtga tgtcagagaa actatgaccc atcttctaca gcatgctaat 5400

ttggaatctg caaagcgagt tcttaatgtg gtgtgtaaac attgtggtca gaaaactact 5460 

accttaacgg gtgtagaagc tgtgatgtat atgggtactc tatcttatga taatcttaag 5520

acaggtgttt ccattccatg tgtgtgtggt cgtgatgcta cacaatatct agtacaacaa 5580 

gagtcttctt ttgttatgat gtctgcacca cctgctgagt ataaattaca gcaaggtaca 5640 

ttcttatgtg cgaatgagta cactggtaac tatcagtgtg gtcattacac tcatataact 5700

gctaaggaga ccctctatcg tattgacgga gctcacctta caaagatgtc agagtacaaa 5760

ggaccagtga ctgatgtttt ctacaaggaa ac^tcttaca ctacaaccat caagcctgtg 5820 

tcgtataaac tcgatggagt tacttacaca gagattgaac caaaattgga tgggtattat 5880

aaaaaggata atgcttacta tacagagcag cctatagacc ttgtaccaac tcaaccatta 5940 

ccaaatgcga gttttgataa tttcaaactc acatgttcta acacaaaatt tgctgatgat 6000 

ttaaatcaaa tgacaggctt cacaaagcca gcttcacgag agctatctgt cacattcttc 6060 

ccagacttga atggcgatgt agtggctatt gactatagac actattcagc gagtttcaag 6120

aaaggtgcta aattactgca taagccaatt gtttggcaca ttaaccaggc tacaaccaag 6180

acaacgttca aaccaaacac ttggtgttta cgttgtcttt ggagtacaaa gccagtagat 6240

acttcaaatt catttgaagt tctggcagta gaagacacac aaggaatgga caatcttgct 6300

tgtgaaagtc aacd4'bccac ctctgaagaa gtagtggaaa atcctaccat acagaaggaa 6360

gtcatagagt gtgacgtgaa aactaccgaa gttgtaggca atgtcatact taaaccatca 6420 

gatgaaggtg ttaaagtaac acaagagtta ggtcatgagg atcttatggc tgcttatgtg 6480 

gaaaacacaa gcattaccat taagaaacct aatgagcttt cactagcctt aggtttaaaa 6540 

acaattgcca ctcatggtat tgctgcaatt aatagtgttc cttggagtaa aattttggct 6600

tatgtcaaac cattcttagg acaagcagca attacaacat caaattgcgc taagagatta 6660

gcacaacgtg tgtttaacaa ttatatgcct tatgtgttta cattattgtt ccaattgtgt 6720

acttttacta aaagtaccaa ttctagaatt agagcttcac tacctacaac tattgctaaa 6780

aatagtgtta agagtgttgc taaattatgt ttggatgccg gcattaatta tgtgaagtca 6840

cccaaatttt ctaaattgtt cacaatcgct atgtggctat tgttgttaag tatttgctta 6900 

ggttctctaa tctgtgtaac tgctgctttt ggtgtactct tatctaattt tggtgctcct 6960 

tcttattgta atggcgttag agaattgtat cttaattcgt ctaacgttac tactatggat 7020 

ttctgtgaag gttcttttcc ttgcagcatt tgtttaagtg gattagactc ccttgattct 7080

tatccagctc ttgaaaccat tcaggtgacg atttcatcgt acaagctaga cttgacaatt 7140 

ttaggtctgg ccgctgagtg ggttttggca tatatgttgt tcacaaaatt cttttattta 7200 

ttaggtcttt cagctataat gcaggtgttc tttggctatt ttgctagtca tttcatcagc 7260

aattcttggc tcatgtggtt tatcattagt attgtacaaa tggcacccgt ttctgcaatg 7320 

gttaggatgt acatcttctt tgcttctttc tactacatat ggaagagcta tgttcatatc 7380 

atggatggtt gcacctcttc gacttgcatg atgtgctata agcgcaatcg tgpcacacgc 7440

gttgagtgta caactattgt taatggcatg aagagatctt tctatgtcta tgcaaatgga 7500

ggccgtggct tctgpaagac tcacaattgg aattgtctca attgtgacac attttgcact 7560

ggtagtacat tcattagtga tgaagttgct cgtgatttgt cactccagtt taaaagacca 7620 

atcaacccta ctgaccagtc atcgtatatt gttgatagtg ttgctgtgaa aaatggcgcg 7680 

cttcacctct actttgacaa ggctggtcaa aagacctatg agagacatcc gctctcccat 7740 

tttgtcaatt tagacaattt gagagctaac aacactaaag gttcactgcc tattaatgtc 7800 

atagtttttg atggcaagtc caaatgcgac gagtctgctt ctaagtctgc ttctgtgtac 7860 

tacagtcagc tgatgtgcca acctattctg ttgcttgacc aagctcttgt atcagacgtt 7920

ggagatagta ctgaagtttc cgttaagatg tttgatgctt atgtcgacac cttttcagca 7980

acttttagtg ttcctatgga aaaacttaag gcacttgttg ctacagctca cagcgagtta 8040

gcaaagggtg tagctttaga tggtgtcctt tctacattcg tgtcagctgc ccgacaaggt 8100

gttgttgata ccgatgttga cacaaaggat gttattgaat gtctcaaact ttcacatcac 8160 

tctgacttag aagtgacagg tgacagttgt aacaatttca tgctcaccta taataaggtt 8220

gaaaacatga cgcccagaga tcttggcgca tgtattgact gtaatgcaag gcatatcaat 8280

gqccaagtag caaaaagtca caatgtttca ctcatctgga atgtaaaaga ctacatgtct 8340 

ttatctgaac agctgcgtaa acaaattcgt agtgctgcca agaagaacaa catacctttt 8400 

agactaactt gtgctacaac tagacaggtt gtcaatgtca taactactaa aatctcactc 8460

aagggtggta agattgttag tacttgtttt aaacttatgc ttaaggccac attattgtgc 8520 

gttcttgctg cattggtttg ttatatcgtt atgccagtac atacattgtc aatccatgst 8580

ggttacacaa atgaaatcat tggttacaaa gccattcagg atggtgtcac tcgtgacatc 8640 

atttctactg atgattgttt tgcaaataaa catgctggtt ttgacgcatg gtttagccag 8700

cgtggtggtt catacaaaeia tcfacaaaagc tgccctgtag tagctgctat cattacaaga 8760

gagattggtt tcatagtgcc tggcttaccg ggtactgtgc tgagagcaat caatggtgac 8820

ttcttgcatt ttctacctcg tgtttttagt gctgttggca acatttgcta cacaccttcc 8880

aaactcattg agtatagtga ttttgctacc tctgcttgcg ttcttgctgc tgagtgtaca 8940 

atttttaagg atgctatggg caaacctgtg ccatattgtt atgacactaa tttgctagag 9000 

ggttctattt cttatagtga gcttcgtcca gacactcgtt atgtgcttat ggatggttcc 9060 

atcatacagt ttcctaacac ttacctggag ggttctgtta gagtagtaac aacttttgat 9120

gctgagtact gtagacatgg tacatgcgaa aggtcagaag taggtatttg cctatctacc 9180 

agtggtagat gggttcttaa taatgagcat tacagagctc tatcaggagt tttctgtggt 9240

gttgatgcga tgaatctcat agctaacatc tttactcctc ttgtgcaacc tgtgggtgct 9300 

ttagatgtgt ctgcttcagt agtggctggt ggtattattg ccatattggt gacttgtgct 9360 

gcctactact ttatgaaatt cagacgtgtt tttggtgagt acaaccatgt tgttgctgct 9420

aatgcacttt tgtttttgat gtctttcact atactctgtc tggtaccagc ttacagcttt 9480

ctgccgggag tctactcagt cttttacttg tacttgacat tctatttcac caatgatgtt 9540

tcattcttgg ctcaccttca atggtttgcc atgttttctc ctattgtgcc tttttggata 9600 

acagcaatct atgtattctg tatttctctg aagcactgcc attggttctt taacaactat 9660 

cttaggaaaa gagtcatgtt taatggagtt acatttagta ccttcgagga ggctgctttg 9720 

tgtacctttt tgctcaacaa ggaaatgtac ctaaaattgc gtagcgagac actgttgcca 9780 

WO 2004/096842 PCT/CA2004/000626

cttacacagt ataacaggta tcttgctcta tataacaagt acaagtattt cagtggagcc 9840

ttagatacta ccagctatcg tgaagcagct tgctgccact tagcaaaggc tctaaatgac 9900 

tttagcaact caggtgctga tgttctctac caaccaccac agacatcdat cacttctgct 9960

gttctgcaga gtggttttag gaaaatggca ttcccgtcag gcaaagttga agggtgcatg 10020

gtacaagtaa cctgtggaac tacaactctt aatggattgt ggttggatga cacagtatac 10080 

tgtccaagac atgtcatttg cacagcagaa gacatgctta atcctaacta tgaagatctg 10140

ctcattcgca aatccaacca tagctttctt gttcaggctg gcaatgttca acttcgtgtt 10200

attggccatt ctatgcaaaa ttgtctgctt aggcttaaag ttgatacttc taaccctaag 10260 

acacccaagt ataaatttgt ccgtatccaa cctggtcaaa cattttcagt tctagcatgc 10320

tacaatggtt caccatctgg tgtttatcag tgtgccatga gacctaatca taccattaaa 10380 

ggttctttcc ttaatggatc atgtggtagt gttggtttta acattgatta tgattgcgtg 10440

tctttctgct atatgcatca tatggagctt ccaacaggag tacacgctgg tactgactta 10500 

gaaggtaaat tctatggtcc atttgttgac agacaaactg cacaggctgc aggtacagac 10560

acaaccataa cattaaatgt tttggcatgg ctgtatgctg ctgttatcaa tggtgatagg 10620

tggtttctta atagattcac cactactttg aatgacttta accttgtggc aatgaagtac 10680

aactatgaac ctttgacaca agatcatgtt gacatattgg gacctctttc tgctcaaaca 10740

ggaattgccg tcttagatat gtgtgctgct ttgaaagagc tgctgcagaa tggtatgaat 10800

ggtcgtacta tccttgg^ag cactatttta gaagatgagt ttacaccatt tgatgttgtt 10860

agacaatgct ctggtgttac cttccaaggt aagttcaaga aaattgttaa gggcactcat 10920 

cattggatgc ttttaacttt cttgacatca ctattgattc ttgttcaaag tacacagtgg 10980

tcactgtttt tctttgttta cgagaatgct ttcttgccat ttactcttgg tattatggca 11040

attgctgcat gtgctatgct gcttgttaag cataagcacg cattcttgtg cttgtttctg 11100

ttaccttctc ttgcaacagt tgcttacttt aatatggtct acatgcctgc tagctgggtg 11160

atgcgtatca tgacatggct tgaattggct gacactagct tgtctggtta taggcttaag 11220 

gattgtgtta tgtatgcttc agctttagtt ttgcttattc tcatgacagc tcgcactgtt 11280 

tatgatgatg ctgctagacg tgtttggaca ctgatgaatg tcattacact tgtttacaaa 11340 

gtctactatg gtaatgcttt agatcaagct atttccatgt gggccttagt tatttctgta 11400 

acctctaact attctggtgt cgttacgact atcatgtttt tagctagagc tatagtgttt 11460 

gtgtgtgttg agtattaccc attgttattt attactggca acaccttaca gtgtatcatg 11520 

cttgtttatt gtttcttagg ctattgttgc tgctgctact ttggcctttt ctgtttactc 11580

aaccgttact tcaggcttac tcttggtgtt tatgactact tggtctctac acaagaattt 11640

aggtatatga actcccaggg gcttttgcct cctaagagta gtattgatgc tttcaagctt 11700

aacattaagt tgttgggtat tggaggtaaa cc^tgtatca aggttgctac tgtacagtct 11760 

aaaatgtctg acgtaaagtg cacatctgtg gtactgctct cggttcttca acaacttaga 11820 

gtagagtcat cttctaaatt gtgggcacaa tgtgtacaac tccacaatga tattcttctt 11880

gcaaaagaca caactgaagc tttcgagaag atggtttctc ttttgtctgt tttgctatcc 11940 

atgcagggtg ctgtagacat taataggttg tgcgaggaaa tgctcgataa ccgtgctact 12000

cttcaggcta ttgcttcaga atttagttct ttaccatcat atgccgctta tgccactgcc 12060 

caggaggcct atgagcaggc tgtagctaat ggtgattctg aagtcgttct caaaaagtta 12120

aagaaatctt tgaatgtggc taaatctgag tttgaccgtg atgctgccat gcaacgcaag 12180

ttggaaaaga tggcagatca ggctatgacc caaatgtaca aacaggcaag atctgaggac 12240

aagagggcaa aagtaactag tgctatgcaa acaatgctct tcactatgct taggaagctt 12300 

gataatgatg cacttaacaa cattatcaac aatgcgcgtg atggttgtgt tccactcaac 12360 
atcataccat tgactacagc agccaaactc atggttgttg tccctgatta tggtacctac 12420

aagaacactt gtgatggtaa cacctttaca tatgcatctg cactctggga aatccagcaa 12480

gttgttgatg cggatagcaa gattgttcaa cttagtgaaa ttaacatgga caattcacca 12540

aatttggctt ggcctcttat tgttacagct ctaagagcca actcagctgt taaactacag 12600 

aataatgaac tgagtccagt agcactacga cagatgtcct gtgcggctgg taccacacaa 12660

acagcttgta ctgatgacaa tgcacttgcc tactataaca attcgaaggg aggtaggttt 12720

gtgctggcat tactatcaga ccaccaagat ctcaaatggg ctagattccc taagagtgat 12780 

ggtacaggta caatttacac agaactggaa ccaccttgta ggtttgttac agacacacca 12840 

aaagggccta aagtgaaata cttgtacttc atcaaaggct taaacaacct aaatagaggt 12900 

atggtgctgg gcagtttagc tgctacagta cgtcttcagg ctggaaatgc tacagaagta 12960 

cctgccaatt caactgtgct ttccttctgt gcttttgcag tagaccctgc taaagcatat 13020

aaggattacc tagcaagtgg aggacaacca atcaccaact gtgtgaagat gttgtgtaca 13080 

cacactggta caggacaggc aattactgta acaccagaag ctaacatgga ccaagagtcc 13140

tttggtggtg cttcatgttg tctgtattgt agatgccaca ttgaccatcc aaatcctaaa 13200 

ggattctgtg acttgaaagg taagtacgtc caaataccta ccacttgtgc taatgaccca 13260 

gtgggtttta cacttagaaa cacagtctgt accgtctgcg gaatgtggaa aggttatggc 13320 

tgtagttgtg accaactccg cgaacccttg atgcagtctg cggatgcatc aacgttttta 13380 

aacgggtttg cggtgtaagt gcagcccgtc ttacaccgtg cggcacaggc actagtactg 13440 
atgtcgtcta cagggctttt gatatttaca acgaaaaagt tgctggtttt gcaaagttcc 13500

taaaaactaa ttgctgtcgc ttccaggaga aggatgagga aggcaattta ttagactett 13560 

actttgtagt taagaggcat actatgtcta actaccaaca tgaagagact atttataact 13620

tggttaaaga ttgliccagcg gttgctgtcc atgacttttt caagtttaga gtagatggtg 13680

acatggtacc acatatatca cgtcagcgtc taactaaata cacaatggct gatttagtct 13740
atgctctacg tcattttgat gagggtaatt gtgatacatt aaaagaaata ctcgtcacat 13800

acaattgctg tgatgatgat tatttcaata agaaggattg gtatgacttc gtagagaatc 13860

ctgacatctt acgcgtatat gctaacttag gtgagcgtgt acgccaatca ttattaaaga 13920 

ctgtacaatt ctgcgatgct atgcgtgatg caggcattgt aggcgtactg acattagata 13980 

atcaggatct taatgggaac tggtacgatt tcggtgattt cgtacaagta gcaccaggct 14040 

gcggagttcc tattgtggat tcatattact cattgctgat gcccatcctc actttgacta 14100

gggcattggc tgctgagtcc catatggatg ctgatctcgc aaaaccactt attaagtggg 14160

atttgctgaa atatgatttt acggaagaga gactttgtct cttcgaccgt tattttaaat 14220 

attgggacca gacataccat cccaattgta ttaactgttt ggatgatagg tgtatccttc 14280 

attgtgcaaa ctttaatgtg ttattttcta ctgtgtttcc acctacaagt tttggaccac 14340 

tagtaagaaa aatatttgta gatggtgttc cttttgttgt ttcaactgga taccattttc 14400

gtgagttagg agtcgtacat aatcaggatg taaacttaca tagctcgcgt ctcagtttca 14460 

aggaactttt agtgtatgct gctgatccag ctatgcatgc agcttctggc aatttattgc 14520 

tagataaacg cactacatgc ttttcagtag ctgcactaac aaacaatgtt gcttttcaaa 14580

ctgtcaaacc cggtaatttt aataaagact tttatgactt tgctgtgtct aaaggtttct 14640 

ttaaggaagg aagttctgtt gaactaaaac acttcttctt tgctcaggat ggcaacgctg 14700 

ctatcagtga ttatgactat tatcgttata atctgccaac aatgtgtgat atcagacaac 14760 

tcctattcgt agttgaagtt gttgataaat actttgattg ttacgatggt ggctgtatta 14820 

atgccaacca agtaatcgtt aacaatctgg ataaatcagc tggtttccca tttaataaat 14880

ggggtaaggc tagactttat tatgactcaa tgagttatga ggatcaagdt gcacttttcg 14940 

cgtatactaa gcgtaatgtc atccctacta taactcaaat gaatcttaag tatgccatta 15000 

gtgcaaagaa tagagctcgc accgtagctg gtgtctctat ctgtagtact atgacaaata 15060 

gacagtttca tcagaaatta ttgaagtcaa tagccgccac tagaggagct actgtggtaa 15120 

ttggaacaag caagttttac ggtggctggc ataatatgtt aaaaactgtt tacagtgatg 15180 

tagaaactcc acaccttatg ggttgggatt atccaaaatg tgacagagcc atgcctaaca 15240 

tgcttaggat aatggcctct cttgttcttg ctcgcaaaca taacacttgc tgtaacttat 15300 

cacaccgttt ctacaggtta gctaacgagt gtgcgcaagt attaagtgag atggtcatgt 15360

gtggcggctc actatatgtt aaaccaggtg gaacatcatc cggtgatgct acaactgctt 15420

atgctaatag tgtctttaac atttgtcaag ctgttacagc caatgtaaat gcacttcttt 15480 

caactgatgg taataagata gctgacaagt atgtccgcaa tctacaacac ^ggctctatg 15540 

agtgtctcta tagaaatagg gatgttgatc atgaattcgt ggatgagttt tacgcttacc 15600 

tgcgtaaaca tttctccatg atgattcttt ctgatgatgc cgttgtgtgc tataacagta 15660 

actatgcggc tcaaggttta gtagctagca ttaagaactt taaggcagtt ctttattatc 15720

aaaataatgt gttcatgtct gaggcaaaat gttggactga gactgacctt actaaaggac 15780 

ctcacgaatt ttgctcacag catacaatgc tagttaaaca aggagatgat tacgtgtacc 15840

tgccttaccc agatccatca agaatattag gcgcaggctg ttttgtcgat gatattgtca 15900 

aaacagatgg tacacttatg attgaaaggt tcgtgtcact ggctattgat gcttacccac 15960

ttacaaaaca tcctaatcag gagtatgctg atgtctttca cttgtattta caatacatta 16020 

gaaagttaca tgatgagctt actggccaca tgttggacat gtattccgta atgctaacta 16080

atgataacac ctcacggtac tgggaacctg agttttatga ggctatgtac acaccacata 16140

cagtcttgca ggctgtaggt gcttgtgtat tgtgcaattc acagacttca cttcgttgcg 16200 

gtgcctgtat taggagacca ttcctatgtt gcaagtgctg ctatgaccat gtcatttfcaa 16260

catcacacaa attagtgttg tctgttaatc cotatgtttg caatgcccca ggttgtgatg 16320 

tcactgatgt gacacaactg tatctaggag gtatgagcta ttattgcaag tcacataagc 16380 

ctcccattag ttttccatta tgtgctaatg gtcaggtttt tggtttatac aaaaacacat 16440

gtgtaggcag tgacaatgtc actgacttca atgcgatagc aacatgtgat tggactaatg 16500 

ctggcgatta catacttgcc aacacttgta ctgagagact caagcttttc gcagcagaaa 16560 

cgctcaaagc cactgaggaa acatttaagc tgtcatatgg tattgccact gtacgcgaag 16620 

tactctctga cagagaattg catctttcat gggaggttgg aaaacctaga ccaccattga 16680 

acagaaacta tgtctttact ggttaccgtg taactaaaaa tagtaaagta cagattggag 16740 

agtacacctt tgaaaaaggt gactatggtg atgctgttgt gtacagaggt actacgacat 16800 

acaagttgaa tgttggtgat tactttgtgt tgacatctca cactgtaatg ccacttagtg 16860 

cacctactct agtgccacaa gagcactatg tgagaattac tggcttgtac ccaacactca 16920 

acatctcaga tgagttttct agcaatgttg caaattatca aaaggtcggc atgcaaaagt 16980 

actctacact ccaaggacca cctggtactg gtaagagtca ttttgccatc ggacttgctc 17040 

tctattaccc atctgctcgc atagtgtata cggcatgctc tcatgcagct gttgatgccc 17100 

WO 2004/096842 PCT/CA2004/000626

tatgtgaaaa ggcattaaaa tatttgccca tagataaatg tagtagaatc atacctgcgc 17160

gtgcgcgcgt agagtgtttt gataaattca aagtgaattc aacactagaa cagtatgttt 17220 

tctgcactgt aaatgcattg ccagaaacaa ctgctgacat tgtagtcttt gatgaaatct 17280 

ctatggctac taattatgac ttgagtgttg tcaatgctag acttcgtgca aaacactacg 17340

tctatattgg cgatcctgct caattaccag ccccccgcac attgctgact aaaggcacac 17400

tagaaccaga atattttaat tcagtgtgca gacttatgaa aacaataggt ccagacatgt 17460 

tccttggaac ttgtcgccgt tgtcctgctg aaattgttga cactgtgagt gcjittagttt 17520 

atgacaataa gctaaaagca cacaaggata agtcagctca atgcttcaaa atgttctaca 17580 

aaggtgttat tacacatgat gtttcatctg caatcaacag acctcaaata ggcgttgtaa 17640 

gagaatttct tacacgcaat cctgcttgga gaaaagctgt ttttatctca ccttataatt 17700

cacagaacgc tgtagcttca aaaatcttag gattgcctac gcagactgtt gattcatcac 17760

agggttctga atatgactat gtcatattca cacaaactac tgaaacagca cactcttgta 17820

atgtcaaccg cttcaatgtg gctatcacaa gggcaaaaat tggcattttg tgcataatgt 17880

ctgatagaga tctttatgac aaactgcaat ttacaagtct agaaatacca cgtcgcaatg 17940 

tggctacatt acaagcagaa aatgtaactg gactttttaa ggactgtagt aagatcatta 18000 

ctgg):cttca tcctacacag gcacctacac acctcagcgt tgatat^aag ttcaagactg 18060

aaggattatg tgttgacata ccaggcatac caaagg^pat gacctaccgt agactcatct 18120

ctatgatggg tttcaaaatg aattaccaag tcaatggtta ccctaatatg tttatcaccc 18180

gcgaagaagc tattcgtcac gttcgtgcgt ggattggctt tgatgtagag ggctgtcatg 18240 

caactagaga tgctgtgggt actaacctac ctctccagct aggattttct acaggtgtta 18300 

acttagtagc tgtaccgact ggttatgttg acactgaaaa taacacagaa ttcaccagag 18360 

ttaatgcaaa acctccacca ggtgaccagt ttaaacatct tataccactc atgtataaag 18420

gcttgccctg gaatgtagtg cgtattaaga tagtacaaat gctcagtgat acactgaaag 18480

gattgtcaga cagagtcgtg ttcgtccttt gggcgcatgg ctttgagctt acatcaatga 18540

agtactttgt caagattgga cctgaaagaa cgtgttgtct gtgtgacaaa cgtgcaactt 18600

gcttttctac ttcatcagat acttatgcct gctggaatca ttctgtgggt tttgactatg 18660 

tctataaccc atttatgatt gatgttcagp agtggggctt tacgggtaac cttcagagta 18720 

accatgacca acattgccag gtacatggaa atgcacatgt ggctagttgt gatgctatca 18780 

tgactagatg tttagcagtc catgagtgct ttgtteagcg cgttgattgg tctgttgaat 18840 

accctattat aggagatgaa ctgagggtta attctgcttg cagaaaagta caacacatgg 18900 
ttgtgaagtc tgcattgctt gctgataagt ttccagttct tcatgacatt ggaaatccaa 18960

aggctatcaa gtgtgtgcct caggctgaag tagaatggaa gttctacgat gctcagccat 19020 

gtagtgacaa agcttacaaa atagagggac tcttctattc ttatgctaca catcacgata 19080

aattcactga tggtgtttgt ttgttttgga attgtaacgt tgatcgttac ccagccaatg 19140

caattgtgtg taggtttgac acaagagtct tgtcaaactt gaacttacca ggctgtgatg 19200

gtggtagttt gtatgtgaat aagcatgcat tccacactcc agctttcgat aaaagtgcat 19260 

ttactaattt aaagcaattg cctttctttt actattctga tagtccttgt gagtctcatg 19320

gcaaacaagt agtgtcggat attgattatg ttccactcaa atctgctacg tgtattacac 19380 

gatgcaattt aggtggtgct gtttgcagac accatgcaaa tgagtaccga cagtacttgg 19440

atgcatataa tatgatgatt tctgctggat ttagcctatg gatttacaaa caatttgata 19500 

cttataacct gtggaataca tttaccaggt tacagagttt agaaaatgtg gcttataatg 19560 

ttgttaataa aggacacttt gatggacacg ccggcgaagc acctgtttcc atcattaata 19620

atgctgttta cacaaaggta gatggtattg atgtggagat ctttgaaaat aagacaacac 19680 

ttcctgttaa tgttgcattt gagctttggg ctaagcgtaa cattaaacca gtgccagaga 19740

ttaagatact caataatttg ggtgttgata tcgctgctaa tactgtaatc tgggactaca 19800 

aaagagaagc cccagcacat gtatctacaa tiaggtgtctg cacaatgact gacattgcca 19860 

agaaacctac tgagagtgct tgttcttcac ttactgtctt gtttgatggt agagtggaag 19920 

gacaggtaga cctttttaga aacgcccgta atggtgtttt aataacagaa ggttcagtca 19980 

aaggtctaac accttcaaag ggaccagcac aagctagcgt caatggagtc acattaattg 20040

gagaatcagt aaaaacacag tttaactafct ttaagaaagt agacggcatt attcaacagt 20100 

tgcctgaaac ctactttact cagagcagag acttagagga ttttaagccc agatcacaaa 20160

tggaaactga ctttctcgag ctcgctatgg atgaattcat acagcgatat aagctcgagg 20220

gctatgcctt cgaacacatc gtttatggag atttcagtca tggacaactt ggcggtcttc 20280 

atttaatgat aggcttagcc aagcgctcac aagattcacc acttaaatta gaggatttta 20340 

tccctatgga cagcacagtg aaaaattact tcataacaga tgcgcaaaca ggttcatcaa 20400 

aatgtgtgtg ttctgtgatt gatcttttac ttgatgactt tgtcgagata ataaagtcac 20460 

aagatttgtc agtgatttca aaagtggtca aggttacaat tgactatgct gaaatttcat 20520 

tcatgctttg gtgtaaggat ggacatgttg aaaccttcta cccaaaacta caagcaagtc 20580 

aagcgtggca accaggtgtt gcgatgccta acttgtacaa gatgcaaaga atgcttcttg 20640 

aaaagtgtga ccttcagaat tatggtgaaa atgctgttat accaaaagga ataatgatga 20700

atgtcgcaaa gtatactcaa ctgtgtcaat acttaaatac acttacttta gctgtaccct 20760 

acaacatgag agttattcac tttggtgctg gctctgataa aggagttgca ccaggtacag 20820

ctgtgctcag acaatggttg ccaactggca cactacttgt cgattcagat cttaatgact 20880 

tcgtctccga cgcagattct actttaattg gagactgtgc aacagtacat acggctaata 20940 

aatgggacct tatiattagc gatatgtatg accctaggac caaacatgtg acaaaagaga 21000

atgactctaa agaagggttt ttcacttatc tgtgtggatt tataaagcaa aaactagccc 21060

tgggtggttc tatagctgta aagataacag agcattcttg gaatgctgac ctttacaagc 21120 

ttatgggcca tttctcatgg tggacagctt ttgttacaaa tgtaaatgca tcatcatcgg 21180

aagcattttt aattggggct aactatcttg gcaagccgaa ggaacaaatt gatggctata 21240 

ccatgcatgc taactacatt ttctggagga acacaaatcc tatccagttg tcttcctatt 21300 

cactctttga catgagcaaa tttcctctta aattaagagg aactgctgta atgtctctta 21360

aggagaatca aatcaatgat atgatttatt ctcttctgga aaaaggtagg cttatcatta 21420 

gagaaaacaa cagagttgtg gtttcaagtg atattcttgt taacaactaa acgaacatgt 21480 

ttattttctt attatttctt actctcacta gtggtagtga ccttgaccgg tgcaccactt 21540

ttgatgatgt tcaagctcct aattacactc aacatacttc atctatgagg ggggtttact 21600 

atcctgatga aatttttaga tcagacactc tttatttaac tcaggattta tttcttccat 21660

tttaVtctaa tgttacaggg tttcatacta ttaatcatac gtttggcaac cctgtcatac 21720

cttttaagga tggtatttat tttgctgcca cagagaaatc aaatgttgtc cgtggttggg 21780

tttttggttc taccatgaac aacaagtcac agtcggtgat tattattaac aattctacta 21840

atgttgttat acgagcatgt aactttgaat tgtgtgacaa ccctttcttt gctgtttcta 21900

aacccatggg tacacagaca catactatga tattcgataa tgcatttaat tgcactttcg 21960 

agtacatatc tgatgccttt tcgcttgatg tttcagaaaa gtcaggtaat tttaaacact 22020 

tacgagagtt tgtgtttaaa aataaagatg ggtttctcta tgtttataag ggctatcaac 22080 

ctatagatgt agttcgtgat ctaccttctg gttttaacac tttgaaacct atttttaagt 22140 

tgcctcttgg tattaacatt acaaatttta gagccattct tacagccttt tcacctgctc 22200 

aagacatttg gggcacgtca gctgcagcct attttgttgg ctatttaaag ccaactacat 22260

ttatgctcaa gtatgatgaa aatggtacaa tcacagatgc tgttgattgt tctcaaaatc 22320 

cacttgctga actcaaatgc tctgttaaga gctttgagat tgacaaagga atttaccaga 22380 

cctctaattt cagggttgtt ccctcaggag atgttgtgag attccctaat attacaaact 22440 

tgtgtccttt tggagaggtt tttaatgcta ctaaattccc ttctgtctat gcatgggaga 22500 

gaaaaaaaat ttctaattgt gttgctgatt actctgtgct ctacaactca acattttttt 22560 

caacctttaa gtgctatggc gtttctgcca ctaagttgaa tgatctttgc ttctccaatg 22620

tctatgcaga ttcttttgta gtcaagggag atgatgtaag acaaatagcg ccaggacaaa 22680

ctggtgttat tgctgattat aattataaat tgccagatga tttcatgggt tgtgtccttg 22740

cttggaatac taggaacatt gatgctactt caactggtaa ttataattat aaatataggt 22800

atcttagaca tggcaagctt aggccctttg agagagacat atctaatgtg cctttctccc 22860

ctgatggcaa accttgcacc ccacctgctc ttaattgtta ttggccatta aatgattatg 22920

gtttttacac cactactggc attggctacc aaccttacag agttgtagta ctttcttttg 22980

aacttttaaa tgcaccggcc acggtttgtg gaccaaaatt atccactgac cttattaaga 23040 

accagtgtgt caattttaat tttaatggac tcactggtac tggtgtgtta actccttctt 23100 

caaagagatt tcaaccattt caacaatttg gccgtgatgt ttctgatttc actgattccg 23160

ttcgagatcc taaaacatct gaaatattag acatttcacc ttgcgctttt gggggtgtaa 23220 

gtgtaattac acctggaaca aatgcttcat ctgaagttgc tgttctatat caagatgtta 23280 

actgcactga tgtttctaca gcaattcatg cagatcaact cacaccagct tggcgcatat 23340 

attctactgg aaacaatgta ttccagactc aagcaggctg tcttatagga gctgagcatg 23400

tcgacacttc ttatgagtgc gacattccta ttggagctgg catttgtgct agttaccata 23460

cagtttcttt attacgtagt actagccaaa aatctattgt ggcttatact atgtctttag 23520 

gtgctgatag ttcaattgct tactctaata acaccattgc tatacctact aacttttcaa 23580 

ttagcattac tacagaagta atgcctgttt ctatggctaa aacctccgta gattgtaata 23640

tgtacatctg cggagattct actgaatgtg ctaatttgct tctccaatat ggtagctttt 23700 

gcacacaact aaatcgtgca ctctcaggta ttgctgctga acaggatcgc aacacacgtg 23760 
aagtgttcgc tcaagtcaaa caaatgtaca aaaccccaac tttgaaatat tttggtggtt 23820 

ttaatttttc acaaatatta cctgaccctc taaagccaac taagaggtct tttattgagg 23880

acttgctctt taataaggtg acactcgctg atgctggctt catgaagcaa tatggcgaat 23940 

gcctaggtga tattaatgct agagatctca tttgtgcgca gaagttcaat ggacttacag 24000 

tgttgccacc tctgctcact gatgatatga ttgctgccta cactgctgct ctagttagtg 24060

gtactgccac tgctggatgg acatttggtg ctggcgctgc tcttcaaata ccttttgcta 24120 

tgcaaatggc atataggttc aatggcattg gagttaccca aaatgttctc tatgagaacc 24180

aaaaacaaat cgccaaccaa tttaacaagg cgattagtca aattcaagaa tcacttacaa 24240 

caacatcaac tgcattgggc aagctgcaag acgttgttaa ccagaatgct caagcattaa 24300 

acacacttgt taaacaactt agctctaatt ttggtgcaat ttcaagtgtg ctaaatgata 24360 

tcctttcgcg acttgataaa gtcgaggcgg aggtacaaat tgacaggtta attacaggca 24420 
gacttcaaag ccttcaaacc tatgtaacac aacaactaat cagggctgct gaaatcaggg 24480

cttctgctaa tcttgctgct actaaaatgt ctgagtgtgt tcttggacaa tcaaaaagag 24540 

ttgacttttg tggaaagggc taccacctta tgtccttccc acaagcagcc ccgcatggtg 24600 

ttgtcttcct acatgtcacg tatgtgccat cccaggagag gaacttcacc acagcgccag 24660

caatttgtca tgaaggcaaa gcatacttcc ctcgtgaagg tgtttttgtg tttaatggca 24720 

cttcttggtt tattacacag aggaacttct tttctccaca aataattact acagacaata 24780

catttgtctc aggaaattgt gatgtcgtta ttggcatcat taacaacaca gtttatgatc 24840 

ctctgcaacc tgagcttgac tcattcaaag aagagctgga caagtacttc aaaaatcata 24900

catcaccaga tgttgatctt ggcgacattt caggcattaa cgcttctgtc gtcaacattc 24960

aaaaagaaat tgaccgcctc aatgaggtcg ctaaaaattt aaatgaatca ctcattgacc 25020

ttcaagaatt gggaaaatat gagcaatata ttaaatggcc ttggtatgtt tggctcggct 25080

tcattgctgg actaattgcc atcgtcatgg ttacaatctt gctttgttgc atgactagtt 25140 

gttgcagttg cctcaagggt gcatgctctt gtggttcttg ctgcaagttt gatgaggatg 25200 

actctgagcc agttctcaag ggtgtcaaat tacattacac ataaacgaac ttatggattt 25260 

gtttatgaga ttttttactc ttagatcaat tactgcacag ccagtaaaaa ttgacaatgc 25320

ttctcctgca agtactgttc atgctacagc aacgataccg ctacaagcct cactcccttt 25380 

cggatggctt gttattggcg ttgcatttct tgctgttttt cagagcgcta ccaaaataat 25440 

tgcgctcaat aaaagatggc agctagccct ttataagggc ttccagttca tttgcaattt 25500 

actgctgcta tttgttacca tctattcaca tcttttgctt gtcgctgcag gtatggaggc 25560 

gcaatttttg tacctctatg ccttgatata ttttctacaa tgcatcaacg catgtagaat 25620 

tattatgaga tgttggcttt gttggaagtg caaatccaag aacccattac tttatgatgc 25680 

caactacttt gtttgctggc acacacataa ctatgactac tgtataccdt ataacagtgt 25740

cacagataca attgtcgtta ctgaaggtga cggcatttca acaccaaaac tcaaagaaga 25800 

ctaccaaatt ggtggttatt ctgaggatag gcactcaggt gttaaagact atgtcgttgt 25860 

acatggctat ttcaccgaag tttactacca gcttgagtct acacaaatta ctacagacac 25920 

tggtattgaa aatgctacat tcttcatctt taacaagctt gttaaagacc caccgaatgt 25980 

gcaaatacac acaatcgacg gctcttcagg agttgctaat ccagcaatgg atccaattta 26040 

tgatgagccg acgacgacta ctagcgtgcc tttgtaagca caagaaagtg agtacgaact 26100 

tatgtactca ttcgtttcgg aagaaacagg tacgttaata gttaatagcg tacttctttt 26160 

tcttgctttc gtggtattct tgctagtcac actagccatc cttactgcgc ttcgattgtg 26220 
tgcgtactgc tgcaatattg ttaacgtgag tttagtaaaa ccaacggttt acgtctactc 26280

gcgtgttaaa aatctgaact cttctgaagg agttcctgat cttctggtct aaacgaacta 26340 

actattatta ttattctgtt tggaacttta ac^ttgctta tcatggcaga caacggtact 26400

attaccgttg aggagcttaa acaactcctg gaacaatgga acctagtaat aggtttccta 26460

ttcctagcct ggattatgtt actacaattt gcctattcta atcggaacag gtttttgtac 26520

ataataaagc ttgttttcct ctggctcttg tggccagtaa cacttgcttg ttttgtgctt 26580

gctgctgtct acagaattaa ttgggtgact ggcgggattg cgattgcaat ggcttgtatt 26640 

gtaggcttga tgtggcttag ctacttcgtt gcttccttca ggctgtttgc tcgtacccgc 26700

tcaatgtggt cattcaaccc agaaacaaac attcttctca atgtgcctct ccgggggaca 26760

attgtgacca gaccgctcat ggaaagtgaa cttgtcattg gtgctgtgat cattcgtggt 26820

cacttgcgaa tggccggaca ctccctaggg cgctgtgaca ttaaggacct gccaaaagag 26880 

atcactgtgg ctacatcacg aacgctttct tattacaaat taggagcgtc gcagcgtgta 26940 

ggcactgatt caggttttgc tgcatacaac cgctaccgta ttggaaacta taaattaaat 27000

acagaccacg ccggtagcaa cgacaatatt gctttgctag tacagtaagt gacaacagat 27060

gtttcatctt gttgacttcc aggttacaat agcagagata ttgattatca ttatgaggac 27120

tttcaggatt gctatttgga atcttgacgt tataataagt tcaatagtga gacaattatt 27180

taagcctcta actaagaaga attattcgga gttagatgat gaagaaccta tggagttaga 27240 

ttatccataa aacgaacatg aaaattattc tcttcctgac attgattgta tttacatctt 27300 

gcgagctata tcactatcag gagtgtgtta gaggtacgac tgtactacta aaagaacctt 27360 

gcccatcagg aacatacgag ggcaattcac catttcaccc tcttgctgac aataaatttg 27420

cactaacttg cactagcaca cactttgctt ttgcttgtgc tgacggtact cgacatacct 27480

atcagctgcg tgcaagatca gtttcaccaa aacttttcat cagacaagag gaggttcaac 27540

aagagctcta ctcgccactt tttctcattg ttgctgctct agtattttta atactttgct 27600

tcaccattaa gagaaagaca gaatgaatga gctcacttta attgacttct atttgtgctt 27660 

tttagccttt ctgctattcc ttgttttaat aatgcttatt atattttggt tttcactcga 27720

aatccaggat ctagaagaac cttgtaccaa agtctaaacg aacatgaaac ttctcattgt 27780

tttgacttgt atttctctat gcagttgcat atgcactgta gtacagcgct gtgcatctaa 27840

taaacctcat gtgcttgaag atccttgtaa ggtacaacac taggggtaat acttatagca 27900 

ctgcttggct ttgtgctcta ggaaaggttt taccttttca tagatggcac actatggttc 27960

aaacatgcac acctaatgtt actatcaact gtcaagatcc agctggtggt gcgcttatag 28020 

ctaggtgttg gtaccttcat gaaggtcacc aaactgctgc atttagagac gtacttgttg 28080 

ttttaaataa acgaacaaat taaaatgtct gataatggac cccaatcaaa ccaacgtagt 28140

gccccccgca ttacatttgg tggacccaca gattcaactg acaataacca gaatggagga 28200

cgcaatgggg caaggccaaa acagcgccga ccccaaggtt tacccaataa tactgcgtct 28260

tggttcacag ctctcactca gcatggcaag gaggaactta gattccctcg aggccagggc 28320 

gttccaatca acaccaatag tggtccagat gaccaaattg gctactaccg aagagctacc 28380

cgacgagttc gtggtggtga cggcaaaatg aaagagctca gccccagatg gtacttctat 28440

tacctaggaa ctggcccaga agcttcactt ccctacggcg ctaacaaaga aggcatcgta 28500 

tgggttgcaa ctgagggagc cttgaataca cccaaagacc acattggpac ccgcaatcct 28560 

aataacaatg ctgccaccgt gctacaactt cctcaaggaa caacattgcc aaaaggcttc 28620 

tacgcagagg gaagcagagg cggcagtcaa gcctcttctc gctcctcatc acgtagtcgc 28680
ggtaattcaa gaaattcaac tcctggcagc agtaggggaa attctcctgc tcgaatggct 28740

agcggaggtg gtgaaactgc cctcgcgcta ttgctgctag acagattgaa ccagcttgag 28800

agcaaagttt ctggtaaagg ccaacaacaa caaggccaaa ctgtcactaa gaaatctgct 28860

gctgaggcat ctaaaaagcc tcgccaaaaa cgtactgcca caaaacagta caacgtcact 28920

caagcatttg ggagacgtgg tccagaacaa acccaaggaa atttcgggga ccaagaccta 28980

atcagacaag gaactgatta caaacattgg ccgcaaattg cacaatttgc tccaagtgcc 29040 

tctgcattct ttggaatgtc acgcattggc atggaagtca caccttcggg aacatggctg 29100 

acttatcatg gagccattaa attggatgac aaagatccac aattcaaaga caacgtcata 29160

ctgctgaaca agcacattga cgcatacaaa acattcccac caacagagcc taaaaaggac 29220

aaaaagaaaa agactgatga agctcagcct ttgccgcaga gacaaaagaa gcagcccact 29280 

gtgactcttc ttcctgcggc tgacatggat gatttctcca gacaacttca aaattccatg 29340 

agtggagctt ctgctgattc aactcaggca taaacactca tgatgaccac acaaggcaga 29400

tgggctatgt aaacgttttc gcaattccgt ttacgataca tagtctactc ttgtgcagaa 29460

tgaattctcg taactaaaca gcacaagtag gtttagttaa ctttaatctc acatagcaat 29520

ctttaatcaa tgtgtaacat tagggaggac ttgaaagagc caccacattt tcatcgaggc 29580 

cacgcggagt acgatcgagg gtacagtgaa taatgctagg gagagctgcc tatatggaag 29640 

agccctaatg tgtaaaatta attttagtag tgctatcccc atgtgatttt aatagcttct 29700 

taggagaatg acaaaaaaaa aaaaaaaaaa aaaaaa 29736 

<210> 3 
<211> 26 
<212> DNA 
<213> Severe acute respiratory syndrome virus



<400> 3 

tctctaaacg aactttaaaa tctgtg 26



<210> 4
<211> 16
<212> DNA
<213> Severe acute respiratory syndrome virus
<400> 4

caactaaacg aacatg 16

<210> 5
<211> 18
<212> DNA
<213> Severe acute respiratory syndrome virus
<400> 5

cacataaacg aacttatg 18

<210> 6
<211> 16
<212> DNA
<213> Severe acute respiratory syndrome virus
<400> 6

tgagtacgaa cttatg 16

<210> 7
<211> 18
<212> DNA
<213> Severe acute respiratory syndrome virus
<400> 7

ggtctaaacg aactaact 18

<210> 8
<211> 11
<212> DNA
<213> Severe acute respiratory syndrome virus
<400> 8

aactataaat t 11

<210> 9
<211> 17
<212> DNA
<213> Severe acute respiratory syndrome virus
<400> 9

tccataaaac gaacatg 17



<210> 10 
<211> 24
<212> DNA
<213> Severe acute respiratory syndrome virus
<400> 10

tgctctagta tttttaatac tttg 24

<210> 11 
<211> 16 
<212> DNA
<213> Severe acute respiratory syndrome virus
<400> 11

agtctaaacg aacatg 16

<210> 12
<211> 15
<212> DNA
<213> Severe acute respiratory syndrome virus
<400> 12

ctaataaacc tcatg 15

<210> 13
<211> 24
<212> DNA
<213> Severe acute respiratory syndrome virus
<400> 13

taaataaacg aacaaattaa aatg 24

<210> 14
<211> 20
<212> PRT

<213> Severe acute respiratory syndrome virus 
<400> 14 

Lys Thr Phe Pro Pro Thr Glu Pro Lys Lys Asp Lys Lys Lys Lys Thr
1               5                   10                  15

Asp Glu Ala Gln
            20



<210> 15 
<211> 29751 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 15 

atattaggtt tttacctacc caggaaaagc caaccaacct cgatctcttg tagatctgtt 60 
ctctaaacga actttaaaat ctgtgtagct gtcgctcggc tgcatgccta gtgcacctac 120 
gcagtataaa caataataaa ttttactgtc gttgacaaga aacgagtaac tcgtccctct 180 

tctgcagact gcttacggtt tcgtccgtgt tgcagtcgat catcagcata cctaggtttc 240

gtccgggtgt gaccgaaagg taagatggag agccttgttc ttggtgtcaa cgagaaaaca 300 

cacgtccaaa tcagtttgcc tgtccttcag gttagagacg tgctagt^cg tggcttcggg 360

gactctgtgg aagaggccct atcggaggca cgtgaacacc tcaaaaatgg cacttgtggt 420 

ctagtagagc tggaaaaagg cgtactgccc cagcttgaac agccctatgt gttcattaaa 480

cgttctgatg ccttaagcac caatcacggc cacaaggtcg ttgagctggt tgcagaaatg 540 

gacggcattc agtacggtcg tagcggtata acactgggag tactcgtgcc acatgtgggc 600

gaaaccccaa ttgcataccg caatgttctt cttcgtaaga acggtaataa gggagccggt 660 

ggtcatagct atggcatcga tctaaagtct tatgacttag gtgacgagct tggcactgat 720 

cccattgaag attatgaaca aaactggaac actaagcatg gcagtggtgc actccgtgaa 780 

ctcactcgtg agctcaatgg aggtgcagtc actcgctatg tcgacaacaa tttctgtggc 840 

ccagatgggt accctcttga ttgcatcaaa gattttctcg cacgcgcggg caagtcaatg 900 

tgcactcttt ccgaacaact tgattacatc gagtcgaaga gaggtgtcta ctgctgccgt 960 

gaccatgagc atgaaattgc ctggttcact gagcgctctg ataagagcta cgagcaccag 1020 

acacccttcg aaattaagag tgccaagaaa tttgacactt tcaaagggga atgcccaaag 1080 

tttgtgtttc ctcttaactc aaaagtcaaa gtcattcaac cacgtgttga aaagaaaaag 1140

actgagggtt tcatggggcg tatacgctct gtgtaccctg ttgcatctcc acaggagtgt 1200

aacaatatgc acttgtctac cttgatgaaa tgtaatcatt gcgatgaagt ttcatggcag 1260

acgtgcgact ttctgaaagc cacttgtgaa cattgtggca ctgaaaattt agttattgaa 1320

ggacctacta catgtgggta cctacctact aatgctgtag tgaaaatgcc atgtcctgcc 1380 

tgtcaagacc cagagattgg acctgagcat agtgttgcag attatcacaa ccactcaaac 1440 

attgaaactc gactccgcaa gggaggtagg actagatgtt ttggaggctg tgtgtttgcc 1500

tatgttggct gctataataa gcgtgcctac tgggttcctc gtgctagtgc tgatattggc 1560 

tcaggccata ctggcattac tggtgacaat gtggagacct tgaatgagga tctccttgag 1620 

atactgagtc gtgaacgtgt taacattaac attgttggcg attttcattt gaatgaagag 1680 

gttgccatc^ ttttggcatc tttctctgct tctacaagtg cctttattga cactataaag 1740 

agtcttgatt acaagtcttt caaaaccatt gttgagtcct gcggtaacta taaagttacc 1800 

aagggaaagc ccgtaaaagg tgcttggaac attggacaac agagatcagt tttaacacca 1860 

ctgtgtggtt ttccctcaca ggctgctggt gttatcagat caatttttgc gcgcacactt 1920

gatgcagcaa accactcaat tcctgatttg caaagagcag ctgtcaccat acttgatggt 1980 

atttctgaac agtcattacg tcttgtcgac gccatggttt atacttcaga cctgctcacc 2040 

aacagtgtca ttattatggc atatgtaact ggtggtcttg tacaacagac ttctcagtgg 2100 

ttgtctaatc ttttgggcac tactgttgaa aaactcaggc ctatctttga atggattgag 2160

gcgaaactta gtgcaggagt tgaatttctc aaggatgctt gggagattct caaatttctc 2220

attacaggtg tttttgacat cgtcaagggt caaatacagg ttgcttcaga taacatcaag 2280

gattgtgtaa aatgcttcat tgatgttgtt aacaaggcac tcgaaatgtg cattgatcaa 2340 

gtcactatcg ctggcgcaaa gttgcgatca ctcaacttag gtgaagtctt catcgctcaa 2400 

agcaagggac tttaccgtca gtgtatacgt ggcaaggagc agctgcaact actcatgcot 2460 

cttaaggcac caaaagaagt aacctttctt gaaggtgatt cacatgacac agtactt^cc 2520 

tctgaggagg ttgttctcaa gaacggtgaa ctcgaagcac tcgagacgcc cgttgatagc 2580

ttcacaaatg gagctatcgt tggcacacca gtctgtgtaa atggcctcat gctcttagag 2640 

attaaggaca aagaacaata ctgcgcattg tctcctggtt tactggctac aaacaatgtc 2700 

tttcgcttaa aagggggtgc accaattaaa ggtgtaacct ttggagaaga tactgtttgg 2760 

gaagttcaag gttacaagaa tgtgagaatc acatttgagc ttgatgaacg tgttgacaaa 2820 

gtgcttaatg aaaagtgctc tgtctacact gttgaatdcg gtaccgaagt tactgagttt 2880

gcatgtgttg tagcagaggc tgttgtgaag actttacaac cagtttctga tctccttacc 2940 

aacatgggta ttgatcttga tgagtggagt gtagctacat tctacttatt tgatgatgct 3000

ggtgaagaaa acttttcatc acgtatgtat tgttcctttt accctccaga tgaggaagaa 3060

gaggacgatg cagagtgtga ggaagaagaa attgatgaaa cctgtgaaca tgagtacggt 3120 

^cagaggatg attatcaagg tctccctctg gaatttggtg cctcagctga aacagttcga 3180

gttgaggaag aagaagagga agactggctg gatgatacta ctgagcaatc agagattgag 3240

ccagaaccag aacctacacc tgaagaacca gttaatcagt ttactggtta tttaaaactt 3300

actgacaatg ttgccattaa atgtgttgac atcgttaagg aggcacaaag tgctaatcct 3360 

atggtgattg taaatgctgc taacatacac ctgaaacatg gtggtggtgt agcaggtgca 3420 

ctcaacaagg caaccaatgg tgccatgcaa aaggagagtg atgattacat taagctaaat 3480

ggccctctta cagtaggagg gtcttgtttg ctttctggac ataatcttgc taagaagtgt 3540 

ctgcatgttg ttggacctaa cctaaatgca ggtgaggaca tccagcttct taaggcagca 3600

tatgaaaatt tcaattcaca ggacatctta cttgcaccat tgttgtcagc aggcatattt 3660 

ggtgctaaac cacttcagtc tttacaagtg tgcgtgcaga cggttcgtac acaggtttat 3720 

attgcagtca atgacaaagc tctttatgag caggttgtca tggattatct tgataacctg 3780 

aagcctagag tggaagcacc taaacaagag gagccaccaa acacagaaga ttccaaaact 3840 

gaggagaaat ctgtcgtaca gaagcctgtc gatgtgaagc caaaa^ttaa ggcctgcatt 3900

gatgaggtta ccacaacact ggaagaaact aagtttctta ccaataagtt actcttgttt 3960 

gctgatatca atggtaagct ttaccatgat tctcagaaca tgcttagagg tgaagatatg 4020

tctttccttg agaaggatgc accttacatg gtaggtgatg ttatcactag tggtgatatc 4080 

acttgtgttg taataccctc caaaaaggct ggtggcacta ctgagatgct ctcaagagct 4140 

ttgaagaaag tgccagttga tgagtatata accacgtacc ctggacaagg atgtgctggt 4200 

tatacacttg aggaagctaa gactgctctt aagaaatgca aatctgcatt ttatgtacta 4260

ccttcagaag cacctaatgc taaggaagag attctaggaa ctgtatcctg gaatttgaga 4320 

gaaatgcttg ctcatgctga agagacaaga aaattaatgc ctatatgcat ggatgttaga 4380 

gccataatgg caaccatcca acgtaagtat aaaggaatta aaattcaaga gggcatcgtt 4440

gactatggtg tccgattctt cttttatact agtaaagagc ctgtagcttc tattattacg 4500

aagctgaact ctctaaatga gccgcttgtc acaatgccaa ttggttatgt gacacatggt 4560 

tttaatcttg aagaggctgc gcgctgtatg cgttctctta aagctcctgc cgtagtgtca 4620 

gtatcatcac cagatgctgt tactacatat aatggatacc tcacttcgtc atcaaagaca 4680 

tctgaggagc actttgtaga aacagtttct ttggctggct cttacagaga ttggtcctat 4740 

tcaggacagc gtacagagtt aggtgttgaa tttcttaagc gtggtgacaa aattgtgtac 4800 

cacactctgg agagccccgt cgagtttcat cttgacggtg aggttctttb acttgacaaa 4860 

ctaaagagtc tcttatccct gcgggaggtt aagactataa aagtgttcac aactgtggac 4920 

aacaqtaatc tccacacaca gcttgtggat atgtctatga catatggaca gcagtttggt 4980 

ccaacatact tggatggtgc tgatgttaca aaaattaaac ctcatgtaaa tcatgagggt 5040 

aagactttct ttgtactacc tagtgatgac acactacgta gtgaagcttt cgagtactac 5100

catactcttg atgagagttt tcttggtagg tacatgtctg ctttaaacca caqaaagaaa 5160

tggaaatttc ctcaagttgg tggtttaact tcaattaaat gggctgataa caattgttat 5220

ttgtctagtg ttttattagc acttcaacag cttgaagtca aattcaatgc accagcactt 5280

caagaggctt attatagagc ccgtgctggt gatgctgcta acttttgtgc actcatactc 5340 

gcttacagta ataaaactgt tggcgagctt ggtgatgtca gagaaactat gacccatctt 5400

ctacagcatg ctaatttgga atctgcaaag cgagttctta atgtggtgtg taaacattgt 5460

ggtcagaaaa ctactacctt aacgggtgta gaagctgtga tgtatatggg tactctatct 5520 

tatgataatc ttaagacagg tgtttccatt ccatgtgtgt gtggtcgtga tgctacacaa 5580 

tatctagtac aacaagagtc ttcttttgtt atgatgtctg caccacctgc tgagtataaa 5640 

ttacagcaag gtacattctt atgtgcgaat gagtacactg gtaactatca gtgtggtcat 5700

tacactcata taactgctaa ggagaccctc tatcgtattg acggagctca ccttacaaag 5760

atgtcagagt acaaaggacc agtgactgat gttttctaca aggaaacatc ttacactaca 5820

accatcaagc ctgtgtcgta taaactcgat ggagttactt acacagagat tgaaccaaaa 5880

ttggatgggt attataaaaa ggataatgct tactatacag agcagcctat agaccttgta 5940

ccaactcaac cattaccaaa tgcgagtttt gataatttca aactcacatg ttctaacaca 6000 

aaatttgctg atgatttaaa tcaaatgaca ggcttcacaa agccagcttc acgagagcta 6060 

tctgtcacat tcttcccaga cttgaatggc gatgtagtgg ctattgacta tagacactat 6120

tcagcgagtt tcaagaaagg tgctaaatta ctgcataagc caattgtttg gcacattaac 6180 

caggctacaa ccaagacaac gttcaaacca aacacttggt gtttacgttg tctttggagt 6240

acaaagccag tagatacttc aaattcattt gaagttctgg cagtagaaga cacacaagga 6300 

atggacaatc ttgcttgtga aagtcaacaa cccacctctg aagaagtagt ggaaaatcct 6360 

accatacaga aggaagtcat agagtgtgac gtgaaaacta ccgaagttgt aggcaatgtc 6420

atacttaaac catcagatga aggtgttaaa gtaacacaag agttaggtca tgaggatctt 6480 

atggctgctt atgtggaaaa cacaagcatt accattaaga aacctaatga gctttcacta 6540

gccttaggtt taaaaacaat tgccactcat ggtattgctg caattaatag tgttccttgg 6600 

agtaaaattt tggcttatgt caaaccattc ttaggacaag cagcaattac aacatcaaat 6660 

tgcgctaaga gattagcaca acgtgtgttt aacaattata tgccttatgt gtttacatta 6720

ttgttccaat tgtgtacttt tactaaaagt accaattcta gaattagagc ttcactacct 6780 

^caactattg ctaaaaatag tgttaagagt gttgctaaat tatgtttgga tgccggcatt 6840 

aattatgtga agtcacccaa attttctaaa ttgttcacaa tcgctatgtg gctattgttg 6900

ttaagtattt gcttaggttc tctaatctgt gtaactgctg cttttggtgt actcttatct 6960

aattttggtg ctccttctta ttgtaatggc gttagagaat tgtatcttaa ttcgtctaac 7020

gttactacta tggatttctg tgaaggttct tttccttgca gcatttgttt aagtggatta 7080 

gactcccttg attcttatcc agctcttgaa accattcagg tgacgatttc atcgtacaag 7140 

ctagacttga caattttagg tctggccgct gagtgggttt tggcatatat gttgttcaca 7200 

aaattctttt atttattagg tctttcagct ataatgcagg tgttctttgg ctattttgct 7260

agtcatttca tcagcaattc ttggctcatg tggtttatca ttagtattgt acaaatggca 7320

cccgtttctg caatggttag gatgtacatc ttctttgctt ctttctacta catatggaag 7380

agctatgttc atatcatgga tggttgcacc tcttcgactt gcatgatgtg ctataagcgc 7440 

aatcgtgcca cacgcgttga gtgtacaact attgttaatg gcatgaagag atctttctat 7500 

gtctatgcaa atggaggccg tggcttctgc aagactcaca attggSattg tctcaattgt 7560 

gacacatttt gcactggtag tacattcatt agtgatgaag ttgctcgtga tttgtcactc 7620

cagtttaaaa gaccaatcaa ccctactgac cagtcatdgt atattgttga tagtgttgct 7680

gtgaaaaatg gcgcgcttca octctacttt gacaaggctg gtcaaaagac ctBtgagag^ 7740

catccgctct cccattttgt caatttagac aatttgagag ctaacaacac taaaggttca 7800 

ctgcctatta atgtcatagt ttttgatggc aagtccaaat gcgacgagtc tgcttctaag 7860 

tctgcttctg tgtactacag tcagctgatg tgccaaccta ttctgttgct tgaccaagct 7920 

cttgtatcag acgttggaga tagtactgaa gtttccgtta agatgtttga tgcttatgtc 7980 

gacacctttt cagcaacttt tagtgttcct atggaaaaac ttaaggcact tgttgctaca 8040 

gctcacagcg agttagcaaa gggtgtagct ttagatggtg tcctttctac attcgtgtca 8100

gctgcccgac aaggtgttgt tgataccgat gttgacacaa aggatgttat tgaatgtctc 8160 

aaactttcac atcactctga cttagaagtg acaggtgaca gttgtaacaa tttcatgctc 8220

acctataata aggttgaaaa catgacgccc agagatcttg gcgcatgtat tgactgta&t 8280 

gcaaggcata tcaatgccca agtagcaaaa agtcacaatg tttcactcat ctggaatgta 8340

aaagactaca tgtctttatc tgaacagjctg cgtaaacaaa ttcgtagtgc tgccaagaag 8400 

aacaacatac cttttagact aacttgtgct acaactagac aggttgtcaa tgtcataact 8460

actaaaatct cactcaaggg tggtaagatt gttagtactt gttttaaact tatgcttaag 8520 

gccacatt^t tgtgcgttct tgctgcattg gtttgttata tcgttatgcc agtacataca 8580

ttgtcaatcc atgatggtta cacaaatgaa atcattggtt acaaagccat tcaggatggt 8640 

gtcactcgtg acatcatttc tactgatgat tgttttgcaa ataaacatgc tggttttgac 8700 

gcatggttta gccagcgtgg tggttcatac aaaaatgaca aaagctgccc tgtagtagct 8760 

gctatcatta caagagagat tggtttcata gtgcctggct taccgggtac tgtgctgaga 8820

gcaatcaatg gtgacttctt gcattttcta cctcgtgttt ttagtgctgt tggcaacatt 8880

tgctacacac cttccaaact cattgagtat agtgattttg ctacctctgc ttgcgttctt 8940 

gctgctgagt gtacaatttt taaggatgct atgggcaaac ctgtgccata ttgttatgac 9000 

actaatttgc tagagggttc tatttcttat agtgagcttc gtccagacac tcgttatgtg 9060 
cttatggatg gttccatcat acagtttcct aacacttacc tggagggttc tgttagagta 9120
gtaacaactt ttgatgctga gtactgtaga catggtacat gcgaaaggtc agaagtaggt 9180 
atttgcctat ctaccagtgg tagatgggtt cttaataatg agcattacag agctctatca 9240 
ggagttttct gtggtgttga tgcgatgaat ctcatagcta acatctttac tcctcttgtg 9300 
caacctgtgg gtgctttaga tgtgtctgct tcagtagtgg ctggtggtat tattgccata 9360

ttggtgactt gtgctgccta ctactttatg aaattcagac gtgtttttgg tgagtacaac 9420

catgttgtt^ ctgctaatgc acttttgttt ttgatgtctt tcactatact ctgtctggta 9480

ccagcttaca gctttctgcc gggagtctac tcagtctttt acttgtactt gacattctat 9540

ttcaccaatg atgtttcatt cttggctcac cttcaatggt ttgccatgtt ttctcctatt 9600 

gtgccttttt ggataacagc aatctatgta ttctgtattt ctctgaagca ctgccattgg 9660

ttctttaaca actatcttag gaaaagagtc atgtttaatg gagttacatt tagtaccttc 9720

gaggaggctg ctttgtgtac ctttttgctc aacaaggaaa tgtacctaaa attgcgtagc 9780

gagacactgt tgccacttac acagtataac aggtatcttg ctctatataa caagtacaag 9840 

tatttcagtg gagccttaga tactaccagc tatcgtgaag cagcttgctg ccacttagca 9900 

aaggctctaa atgactttag caactcaggt gctgatgttc tctaccaacc accacagaca 9960

tcaatcactt ctgctgttct gcagagtggt tttaggaaaa tggcattccc gtcaggcaaa 10020 

gttgaagggt gcatggtaca agtaacctgt ggaactacaa ctcttaatgg attgtggttg 10080 

gatgacacag tatactgtcc aagacatgtc atttgcacag cagaagacat gcttaatcct 10140

aactatgaag atctgctcat tcgcaaatcc aac£:ata^ct ttcttgttca ggctggcaat 10200

gttcaacttc gtgttattgg ccattctatg caaaattgtc tgcttaggct taaagttgat 10260 

acttctaacc ctaagacacc caagtataaa tttgtccgta tccaacctgg tcaaacattt 10320 

tcagttctag catgctacaa tggttcacca tctggtgttt atcagtgtgc catgagacct 10380 

aatcatacca ttaaaggttc tttccttaat ggatcatgtg gtagtgttgg ttttaacatt 10440 

gattatgatt gcgtgtcttt ctgctatatg catcatatgg agcttccaac aggagtacac 10500 

gctggtdctg acttagaagg taaattctat ggtccatttg ttgacagaca aactgcacag 10560

gctgcaggta cagacacaac cataaciatta aatgttttgg catggctgta tgctgctgtt 10620 

atcaatggtg ataggtggtt tcttaataga ttcaccacta ctttgaatga ctttaacctt 10680 

gtggcaatga agtacaacta tgaacctttg acacaagatc atgttgacat attgggacct 10740

ctttctgctc aaacaggaat tgccgtctta gatatgtgtg ctgctttgaa agagctgctg 10800

cagaatggta tgaatggtcg tactatcctt ggtagcacta ttttagaaga tgagtttaca 10860 

ccatttgatg ttgttagaca atgctctggt gttaccttcc aaggtaagtt caagaaaatt 10920

gttaagggca ctcatcattg gatgctttta actttcttga catcactatt gattcttgtt 10980 

caaagtacac agtggtcact gtttttcttt gtttacgaga atgctttctt gccatttact 11040 

cttggtatta tggcaattgc tgcatgtgct atgctgcttg ttaagcataa gcacgcattc 11100 

ttgtgcttgt ttctgttacc ttctcttgca acagttgctt actttaatat ggtctacatg 11160 
cctgctagct gggtgatgcg tatcatgaca tggcttgaat tggctgacac tagcttgtct 11220
ggttataggc ttaaggattg tgttatgtat gcttcagctt tagttttgct tattctcatg 11280

acagctcgca ctgtttatga tgatgctgct agacgtgttt ggacactgat gaatgtcatt 11340

acacttgttt acaaagtcta ctatggtaat gctttagatc aagctatttc catgtgggcc 11400

ttagttattt ctgtaacctc taactattct ggtgtcgtta cgactatcat gtttttagct 11460
agagctatag tgtttgtgtg tgttgagtat tacccattgt tatttattac tggcaacacc 11520 

ttacagtgta tcatgcttgt ttattgtttc ttaggctatt gttgctgctg ctactttggc 11580 

cttttctgtt tactcaaccg ttacttcagg cttactcttg gtgtttatga ctapttggtc 11640 

tctacacaag aatttaggta tatgaactcc caggggcttt tgcctcctaa gagtagtatt 11700 

gatgctttca agcttaacat taagttgttg ggtattggag gtaaaccatg tatcaaggtt 11760

gctactgtac agtctaaaat gtctgacgta aagtgcacat ctgtggtact gctctcggtt 11820

cttcaacaac ttagagtaga gtcatcttct aaattgtggg cacaatgtgt acaactccac 11880

aatgatattc ttcttgca&a agacacaact gaagctttcg agaagatggt ttctcttttg 11940

tctgttttgc tatccatgca gggtgctgta gacattaata ggttgtgcga ggaaatgctc 12000 

gataaccgtg ctactcttca ggctattgct tcagaattta gttctttacc atcatatgcc 12060 

gctt^tgcca ctgcccagga ggcctatgag caggctgtag ctaatggtga ttctgaagtc 12120
gttctcaaaa agttaaagaa atctttgaat gtggctaaat ctgagtttga ccgtgatgct 12180

gccatgcaac gcaagttgga aaagatggca gatcaggcta tgacccaaat gtacaaacag 12240 

gcaagatctg aggacaagag ggcaaaagta actagtgcta tgcaaacaat gctcttcact 12300 

atgcttagga agcttgataa tgatgcactt aacaacatta tcaacaatgc gcgtgatggt 12360 

tgtgttccac tcaacatcat accattgact acagcagcca aactcatggt tgttgtccct 12420 

gattatggta cctacaagaa cacttgtgat ggtaacacct ttacatatgc atctgcactc 12480 

tgggaaatcc agcaagttgt tgatgcggat agcaagattg ttcaacttag tgaaattaac 12540

atggacaatt caccaaattt ggcttggcct cttattgtta cagctctaag agccaactca 12600 

gctgttaaac tacagaataa tgaactgagt ccagtagcac tacgacagat gtcctgtgcg 12660 

gctggtacca cacaaacagc ttgtactgat gacaatgcac ttgcctacta taacaattcg 12720 

aagggaggta ggtttgtgct ggcattacta tcagaccacc aagatctcaa atgggctaga 12780

ttccctaaga gtgatggtac aggtacaatt tacacagaac tggaaccacc ttgtaggttt 12840 

gttacagaca caccaaaagg gcctaaagtg aaatacttgt acttcatcaa aggcttaaac 12900 

aacctaaata gaggtatggt gctgggcagt ttagctgcta cagtacgtct tcaggctgga 12960 
aatgctacag aagtacctgc caattcaact gtgctttcct tctgtgcttt tgcagtagac 13020

cctgctaaag catataagga ttacctagca agtggaggac aaccaatcac caactgtgtg 13080

aagatgttgt gtacacacac tggtacagga caggcaatta ctgtaacacc agaagctaac 13140

atggaccaag agtcctttgg tggtgcttca tgttgtctgt attgtagatg ccacattgac 13200 

catccaaatc ctaaaggatt ctgtgacttg aaaggtaagt acgtccaaat acctaccact 13260 

tgtgctaatg acccagtggg ttttacactt agaaacacag tctgtaccgt ctgcggaatg 13320 

tggaaaggtt atggctgtag ttgtgaccaa ctccgcgaac ccttgatgca gtctgcggat 13380 

gcatcaacgt ttttaaacgg gtttgcggtg taagtgcagc ccgtcttaca ccgtgcggca 13440 

caggcactag tactgatgtc gtctacaggg cttttgatat ttacaacgaa aaagttgctg 13500 

gttttgcaaa gttcctaaaa actaattgct gtcgcttcca ggagaaggat gaggaaggca 13560

atttattaga ctcttacttt gtagttaaga ggcatactat gtctaactac caacatgaag 13620 

agactattta taacttggtt aaagattgtc cagcggttgc tgtccatgac tttttcaagt 13680

ttagagtaga tggtgacatg gtaccacata tatcacgtca gcgtctaact aaatacacaa 13740 

tggctgattt agtctatgct ctacgtcatt ttgatgaggg taattgtgat acattaaaag 13800

aaatactcgt cacatacaat tgctgtgatg atgattattt caataagaag gattggtatg 13860

acttcgtaga gaatcctgac atcttacgcg tatatgctaa cttaggtgag cgtgtacgcc 13920

aatcattatt aaagactgta caattctgcg atgctatgcg tgatgcaggc attgtaggcg 13980

tactgacatt agataatcag gatcttaatg ggaactggta cgatttcggt gatttcgtac 14040 

aagtagcacc aggctgcgga gttcctattg tggattcata ttactcattg ctgatgccca 14100 

tcctcacttt gactagggca ttggctgctg agtcccatat ggatgctgat ctcgcaaaac 14160

cacttattaa gtgggatttg ctgaaatatg attttacgga agagagactt tgtctcttcg 14220 

accgttattt taaatattgg gaccagacat accatcccaa ttgtattaac tgtttggatg 14280 

ataggtgtat ccttcattgt gcaaacttta atgtgttatt ttctactgtg tttccaccta 14340 

caagttttgg accactagta agaaaaatat ttgtagatgg tgttcctttt gttgtttcaa 14400 

ctggatacca ttttcgtgag ttaggagtcg tacataatca ggatgtaaac ttacatagct 14460

cgcgtctcag tttcaaggaa cttttagtgt atgctgctga tccagctatg catgcagctt 14520 

ctggcaattt attgctagat aaacgcacta catgcttttc agtagctgca ctaacaaaca 14580 

atgttgcttt tcaaactgtc aaacccggta attttaataa agacttttat gactttgctg 14640 

tgtctaaagg tttctttaag gaaggaagtt ctgttgaact aaaacacttc ttctttgctc 14700 

aggatggcaa cgctgctatc agtgattatg actattatcg ttataatctg ccaacaatgt 14760

gtgatatcag acaactcctai ttcgtagttg aagttgttga taaatacttt gattgttacg 14820 
atggtggctg tattaatgcc aaccaagtaa tcgttaacaa tctggataaa tcagctggtt 14880

tcccatttaa taa^tggggt aaggctagac tttattatga ctcaatgagt tatgaggatc 14940

aagatgcact tttcgcgtat actaagcgta atgtcatccc tactataact caaatgaatc 15000

ttaagtatgc ciattagtgca aag^atagag ctcgcaccgt agctggtgtc tctatctgta 15060 

gtactatgac aaatagacag tttcatcaga aattattgaa gtcaatagcc gccactagag 15120

gagctactgt ggtaattgga acaagcaagt tttacggtgg ctggcataat atgttaaaaa 15180

ctgtttacag tgatgtagaa actccacacc ttatgggttg ggattatcca aaatgtgaca 15240 

gagccatgcc taacatgctt aggataatgg cctctcttgt tcttgctcgc aaacpataaca 15300 

cttgctgtaa cttatcacac cgtttctaca ggttagctaa cgagtgtgcg caagtattaa 15360 

gtgagatggt catgtgtggc ggctcactat atgttaaacc aggtggaaca tcatccggtg 15420 

atgctacaac tgcttatgct aatagtgtct ttaacatttg tcaagctgtt acagccaatg 15480 

taaatgcact tctttcaact gatggtaata agatagctga caagtatgtc cgcaatctac 15540 

aacacaggct Qtatgagtgt ctctatagaa atagggatgt tgatcatgaa ttcgtggatg 15600 

agttttacgc ttacctgcgt aaacatttct ccatgatgat tctttctgat gatgccgttg 15660 

tgtgctataa cagtaactat gcggctcaag gtttagtagc tagcattaag aactttaagg 15720

cagticttta ttatcaaaat aatgtgttca tgtctgaggc aaaatgttgg actgagactg 15780 

accttactaa aggacctcac gaattttgct cacagcatac aatgctagtt akacaaggag 15840

atgattacgt gtacctgcct tacccagatc catcaagaat attaggcgca ggctgttttg 15900

tcgatgatat tgtcaaaaca gatggtacac ttatgattga aaggttcgtg tcactggcta 15960

ttgatgctta cccacttaca aaacatccta atcaggagta tgctgatgtc tttcacttgt 16020 

atttacaata cattagaaag ttacatgatg agcttactgg ccacatgttg gacatgtatt 16080 

ccgtaatgct aactaatgat aacacctcac ggtactggga acctgagttt tatgaggcta 16140 

tgtacacacc acatacagtc ttgcaggctg taggtgcttg tgtattgtgc aattcacaga 16200 

cttcacttcg ttgcggtgcc tgtattagga gaccattcct atgttgcaag tgctgctatg 16260 

accatgtcat ttcaacatca cacaaattag tgttgtctgt taatccctat gtttgcaatg 16320 

ccccaggttg tgatgtcact gatgtgacac aactgtatct aggaggtatg agctafctatt 16380

gcaagtcaca taagcctccc attagttttc cattatgtgc taatggtcag gtttttggtt 16440 

tatacaaaaa cacatgtgta ggcagtgaca atgtcactga cttcaatgcg atagcaacat 16500 

gtgattggac taatgctggc gattacatac ttgccaacac ttgtactgag agactcaagc 16560 

ttttcgcagc agaaacgctc aaagccactg aggaaacatt taagctgtca tatggtattg 16620 



ccactgtacg cgaagtactc tctgacagag aattgcatct ttcatgggag gttggaaaac 16680 

ctagaccacc attgaacaga aactatgtct ttactggtta ccgtgtaact ^aaaatagta 16740 

aagtacagat tggagagtac acctttgaaa aaggtgacta tggtgatgct gttgtgtaca 16800 

gaggtactac gac^^.^caag ttgaatgttg gtgattactt tgtgttgaca tctcacactg 16860 

taatgccact tagtgcacct actctagtgc cacaagagca ctatgtgaga attactggct 16920 

tgtacccaac actcaacatc tcagatgagt tttctagcaa tgttgcaaat tatcaaaagg 16980 

tcggcatgca aaagtactct acactccaag gaccacctgg tactg^taag agtcattttg 17040

ccatcggact tgctctctat tacccatctg ctcgcatagt gtatacggca tgctctcatg 17100 

cagctgttga tgccctatgt gaaaaggcat taaaatattt gcccatagat aaatgtagta 17160 

gaatcatacc tgcgcgtgcg cgcgtagagt gttttgataa attcaaagtg aattcaacac 17220

tagaacagta tgttttctgc actgtaaatg cattgccaga aacaactgct gacattgt^g 17280

tctttgatga aatctctatg gctactaatt atgacttgag tgttgtcaat gctagacttc 17340 

gtgcaaaaca ctacgtctat attggcgatc ctgctcaatt accagccccc cgcacattgc 17400 
tgactaaagg cacactagaa ccagaatatt ttaattcagt gtgcagactt atgaaaacaa 17460

taggtccaga catgttcctt ggaacttgtc gccgttgtcc tgctgaaatt gttgacactg 17520 

tgagtgcttt agtttatgac aataagctaa aagcacacaa ggataagtca gctcaatgct 17580

tcaaaatgtt ctacaaaggt gttattacac atgatgtttc atctgcaatc aacagacctc 17640 

aaataggdgt tgtaagagaa tttcttacac gcaatcctgc ttggagaaaa gctgttttta 17700 

tctcacctta taattcacag aacgctgtag cttcaaaaaf cttaggattg cctacgcaga 17760 

ctgttgattc atcacagggt tctgaatatg actatgtcat attcacacaa actactgaaa 17820 

cagcacactc ttgtaatgtc aaccgcttca atgtggctat cacaagggca aaaattggca 17880 

ttttgtgcat aatgtctgat agagatcttt atgacaaact gcaatttaca agtctagaaa 17940 

taccacgtcg caatgtggct acattacaag cagaaaatgt aactggactt tttaaggact 18000 

gtagtaagat cattactggt cttcatccta cacaggcacc tacacacctc agcgttgata 18060
taaagttcaa gactgaagga ttatgtgttg acataccagg cataccaaag gacatgacct 18120
accgtagact catctctatg atgggtttca aaatgaatta ccaagtcaat ggttacccta 18180
atatgtttat cacccgcgaa gaagctattc gtcacgttcg tgcgtggatt ggctttgatg 18240
tagagggctg tcatgcaact agagatgctg tgggtactaa cctacctctc cagctaggat 18300 
tttctacagg tgttaactta gtagctgtac cgactggtta tgttgacact gaaaataaca 18360 
cagaattcac cagagttaat gcaaaacctc caccaggtga ccagtttaaa catcttatac 18420 
cactcatgta taaaggcttg ccctggaatg tagtgcgtat taagatagta caaatgctca 18480 
gtgatacact gaaaggattg tcagacagag tcgtgttcgt cctttgggcg catggctttg 18540

agcttacatc aatgaagtac tttgtcaaga ttggacctga aagaacgtgt tgtctgtgtg 18600 

acaaacgtgc aacttgcttt tctacttcat cagatactta tgcctgctgg aatcattctg 18660 

tgggttttga ctatgtctat aacccattta tgattgatgt tcagcagtgg ggctttacgg 18720

gtaaccttca gagtaaccat gaccaacatt gccaggtaca tggaaatgca catgtggcta 18780

gttgtgatgc tatcatgact agatgttfag cagtccatga gtgctttgtt aagcgcgttg 18840

attggtctgt tgaataccct attataggag atgaactgag ggttaattct gcttgcagaa 18900 

aagtacaaca catggttgtg aagtctgcat tgcttgctga taagtttcca gttpttcatg 18960 

acattggaaa tccaaaggct atcaagtgtg tgcctcaggc tgaagtagaa tggaagttct 19020 

acgatgctca gccatgtagt gacaaagctt acaaaataga ggaactcttc tattcttatg 19080

ctacacatca cgataaattc actgatggtg tttgtttgtt ttggaattgt aacgttgatc 19140 

gttacccagc caatgcadtt gtgtgtaggt ttgacacaag agtcttgtca aacttgaact 19200

taccaggctg tgatggtggt agtttgtatg tgaataagca tgcattccac actccagctt 19260

tcgataaaag tgcattt^ct aatttaaagc aattgccttt cttttactat tctgatagtc 19320 

cttgtgagtc tcatggc^aa caagtagtgt cggatattga ttatgttcca ctcaaatctg 19380

ctacgtgtat tacacgatgc aatttaggtg gtgctgtttg cagacaccat gc^aatgagt 19440 

accgacagta cttggatgca tataatatga tgatttetgc t^gatttagc ctatggattt 19500

acaaacaatt tgatacttat aacctgtgga atacatttac caggttacag agtttagaaa 19560 

atgtggctta taatgttgtt aataaaggac actttgatgg acacgccggc gaagcacctg 19620 

tttccatcat taataatgct gtttacacaa aggtagatgg tattgatgtg gagatctttg 19680 

aaaataagac aacacttcct gttaatgttg catttgagct ttgggctaag cgtaacatta 19740 

aaccagtgcc agagattaag atactcaata atttgggtgt tgatatcgct gctaatactg 19800 

taatctggga ctacaaaaga gaagccccag cacatgtatc tacaataggt gtctgcacaa 19860 

tgactgacat tgccaagaaa cctactgaga gtgcttgttc ttcacttact gtcttgtttg 19920 

atggtagagt ggaaggacag gtagaccttt ttagaaacgc ccgtaatggt gttttaataa 19980

cagaaggttc agtcaaaggt ctaacacctt caaagggacc agcacaagct agcgtcaatg 20040 

gagtcacatt aattggagaa tcagtaaaaa cacagtttaa ctactttaag aaagtagacg 20100 

gcattattca acagttgcct gaaacctact ttactcagag cagagactta gaggatttta 20160 

agcccagatc acaaatggaa actgactttc tcgagctcgc tatggatgaa ttcatacagc 20220 

gatataagct cgagggctat gccttcgaac acatcgttta tggagatttc agtcatggac 20280
aacttggcg9 tcttcattta atgataggct tagccaagcg ctcacaagat tcaccactta 20340

aattagagga ttttatccct atggacagca cagtgaaaaa ttacttcata acagatgcgc 20400 

aaacaggttc atcaaaatgt gtgtgttctg tg^ittgatct tttacttgat gactttgtcg 20460 

agataataaa gtcacaagat ttgtcagtga tttcaaaagt ggtcaaggtt acaattgact 20520

atgctgaaat ttcattcatg ctttggtgta aggatggaca tgttgaaacc ttctacccaa 20580 

aactacaagc aagtcaagcg tggcaaccag gtgttgcgat gcctaacttg tacaagatgc 20640 

aaagaatgct tcttgaaaag tgtgaccttc agaattatgg tgaaaatgct gttataccaa 20700

aaggaataat gatgaatgtc gcaaagtata ctcaactgtg tcaatactta aatacactta 20760 

ctttagctgt accctacaac atgagagtta ttcactttgg tgctggctct gataaaggag 20820 

ttgcaccagg tacagctgtg ctcagacaat ggttgccaac tggcacacta cttgtcgatt 20880 

cagatcttaa tgacttcgtc tccgacgcag attctacttt aattggagac tgtgcaacag 20940

tacatacggc taataaatgg gaccttatta ttagcgatat gtatgaccct aggaccaaac 21000

atgtgacaaa agagaatgac tctaaagaag ggtttttcac ttatctgtgt ggatttataa 21060 

agcaaaaact agccctgggt ggttctatag ctgtaaagat aacagagcat tcttggaatg 21120

ctgaccttta caagcttatg ggccatttct catggtggac agcttttgtt acaaatgtaa 21180 

atgcatcatc atcggaagca tttttaattg gggctaacta tcttggcaag ccgaaggaac 21240
aaattgatgg ctataccatg catgctaact acattttctg gaggaacaca aatcctatcc 21300 

agttgtcttc ctalttcactc tttgacatga gcaaatttcc tcttaaatta agaggaactg 21360 

ctgtaatgtc tcttaaggag aatcaaatca atg^itatgat ttattctctt ctggaaaaag 21420

gtaggcttat cattagagaa aacaacagag ttgtggtttc aagtgatatt cttgttaaca 21480

actaaacgaa catgtttatt ttcttattat ttcttactct cactagtggt agtgaccttg 21540 

accggtgcac cacttttgat gatgttcaag ctcctaatta cactcaacat acttcatcta 21600 

tgaggggggt ttactatcct gatgaaattt ttagatcaga cactctttat ttaactcagg 21660 
atttatttct tccattttat tctaatgtta cagggtttca tactattaat catacgtttg 21720 

gcaaccctgt catacctttt aaggatggta tttattttgc tgccacagag aaatcaaatg 21780 

ttgtccgtgg ttgggttttt ggttctacca tgaacaacaa gtcacagtcg gtgattatta 21840 

ttaacaattc tactaatgtt gttatacgag catgtaactt tgaattgtgt gacaaccctt 21900 

tctttgctgt ttctaaaccc atgggtacac agacacatac tatgatattc gataatgcat 21960 

ttaattgcac tttcgagtac atatctgatg ccttttcgct tgatgtttca gaaaagtcag 22020

gtaattttaa acacttacga gagtttgtgt ttaaaaataa agatgggttt ctctatgttt 22080

ataagggcta tcaacctata gatgtagttc gtgatctacc ttctggtttt aacactttga 22140 
aacctatttt taagttgcct cttggtatta acattacaaa tttta^agcc attctt^cag 22200

ccttttcacc tgctcaagac atttggggca cgtcagctgc agcctatttt gttggctatt 22260 

taaagccaac tacatttatg ctcaagtatg atgaaaatgg tacaatcaca gatgctgttg 22320

attgttctca aaatccactt gctgaactca aatgctctgt taagagcttt gagattgaca 22380

aaggaattta ccagacctct aatttcaggg ttgttccctc aggagatgtt gtgagattcc 22440

ctaatattac aaacttgtgt ccttttggag aggtttttaa tgctactaaa ttcccttctg 22500 

tctatgcatg ggagagaaaa aaaatttcta attgtgttgc tgattactct gtgctctaca 22560 

actcaacatt tttttcaacc tttaagtgct atggcgtttc tgccactaag ttgaatgatc 22620

tttgcttctc caatgtctat gcagattctt ttgtagtcaa gggagatgat gtaagacaaa 22680 

tagcgccags acaaactggt gttattgctg attataatta taaattgcca gatgatttca 22740

tgggttgtgt ccttgcttgg aatactagga acattgatgc tacttcaact ggtaattata 22800

attataaata taggtatctt agacatggca agcttaggcc ctttgagaga gacatatcta 22860

atgtgccttt ctcccctgat ggcaaacctt gcaccccacc tgctcttaat tgttattggc 22920 

cattaaatga ttatggtttt tacaccacta ctggcattgg ctaccaacct tacagagttg 22980

tagtactttc ttttgaactt ttaaatgcac cggccacggt ttgtggacca aaattatcca 23040 

ctgaccttat taagaaccag tgtgtcaatt ttaattttaa tggactcact ggtactggtg 23100

tgttaactcc ttcttcaaag agatttcaac catttcaaca atttggccgt gatgtttctg 23160

atttcactga ttccgttcga gatcctaaaa catctgaaat attagacatt tcaccttgcg 23220 

cttttggggg tgtaagtgta attacacctg gaacaaatgc ttcatctgaa gttgctgttc 23280

tatatcaaga tgttaactgc actgatgttt ctacagcaat tcatgcagat caactcacac 23340 

cagcttggcg catatattct actggaaaca atgtattcca gactcaagca ggctgtctta 23400 

taggagctga gcatgtcgac acttcttatg agtgcgacat tcctattgga gctggcattt 23460

gtgctagtta ccatacagtt tctttattac gtagtactag ccaaaaatct attgtggctt 23520

atactatgtc tttaggtgct gatagttcaa ttgcttactc taataacacc attgctatac 23580 

ctactaactt ttcaattagc attactacag aagtaatgcc tgtttctatg gctaaaacct 23640 

ccgtagattg taatatgtac atctgcggag attctactga atgtgctaat ttgcttctcc 23700 

aatatggtag cttttgcaca caactaaatc gtgcactctc aggtattgct gctgaacagg 23760 

atcgcaacac acgtgaagtg ttcgctcaag tcaaacaaat gtacaaaacc ccaactttga 23820 

aatattttgg tggttttaat ttttcacaaa tattacctga ccctctaaag ccaactaaga 23880 

ggtcttttat tgaggacttg ctctttaata aggtgacact cgctgatgct ggcttcatga 23940 
agcaatatgg cgaatgccta ggtgatatta atgctagaga tctcatttgt gcgcagaagt 24000

tcaatggact tacagtgttg ccacctctgc tcactgatga tatgattgct gcctacactg 24060

ctgctctagt tagtggtact gccactgctg gatggacatt tggtgctggc gctgctcttc 24120 

aaataccttt tgctajbgcaa atggcatata ggttcaatgg cattggagtt acccaaaatg 24180

ttctctatga gaaccaaaaa caaatcgcca accaatttaa caaggcgatt agtcaaattc 24240 

aagaatcact tacaacaaca tcaactgcat tgggcaagct gcaagacgtt gttaaccaga 24300

^tgctcaagc attaaacaca cttgttaaac aacttagctc taattttggt gcaatiitcaa 24360 

gtgtgctaaa tgatatcctt tcgcgacttg ataaagtcga ggcggaggta caaattgaca 24420 

ggttaattac aggcagactt caaagccttc aaacctatgt aacacaacaa ctaatcaggg 24480 

ctgctgaaat cagggcttct gctaatcttg ctgctactaa aatgtctgag tgtgttcttg 24540

gacaatcaaa aagagttgac ttttgtggaa agggctacca ccttatgtcc ttcccaca^g 24600

cagccccgca tggtgttgtc ttcctacatg tcacgtatgt gccatcccag gagaggaact 24660

tcaccacagc gccagcaatt tgtcatgaag gcaaagcata cttccctcgt gaaggtgttt 24720 

ttgtgtttaa tggcacttct tggtttatjta cacagaggaa cttcttttot ccacaaataa 24780 

ttactacaga caatacattt gtctcaggaa attgtgatgt cgttattggc atcattaaca 24840

acacagttta tgatcctctg caacctgagc ttgactcatt caaagaagag ctggacaagt 24900 

acttcaaaaa tcatacatca ccagatgttg atcttggcga catttcaggc attaacgctt 24960 

ctgtcgtcaa cattcaaaaa gaaattgacc gcctcaatga ggtcgctaaa aatttaaatg 25020

aatcactcat tgaccttcaa gaattgggaa aatatgagca atatattaaa tggccttggt 25080 

^tgtttggct cggcttcatt gctggactaa ttgccatcgt catggttaca atcttgcttt 25140

gttgcatgac tagttgttgc agttgcctca agggtgcatg ctcttgtggt tcttgctgca 25200 

agtttgatga ggatgactct gagccagttc tcaagggtgt caaattacat tacacataaa 25260

cgaacttatg gatttgttta tgagattttt tactcttaga tcaattactg cacagccagt 25320 

aaaaattgac aatgcttctc ctgcaagtac tgttcatgct acagcaacga taccgctaca 25380 

agcctcactc cctttcggat ggcttgttat tggcgttgca tttcttgctg tttttcagag 25440

cgctaccaaa ataattgcgc tcaataaaag atggcagcta gccctttata agggcttcca 25500 

gttcatttgc aatttactgc tgctatttgt taccatctat tcacatcttt tgcttgtcgc 25560

tgcaggtatg gaggcgcaat ttttgtacct ctatgccttg atatattttc tacaatgcat 25620 

caacgcatgt agaattatta tgagatgttg gctttgttgg aagtgcaaat ccaagaaccc 25680 

attactttat gatgccaact actttgtttg ctggcacaca cataactatg actactgtat 25740 

accatataac agtgtcacag atacaattgt cgttactgaa ggtgacggca tttcaacacc 25800 

aaaactcaaa gaagactacc aaattggtgg ttattctgag gataggcact caggtgttaa 25860 

agactatgtc gttgtacatg gctatttcac cgaagtttac taccagcttg agtctacaca 25920

aattactaca gacactggta ttgaaaatgc tacattcttc atctttaaca agcttgttaa 25980 

agacccaccg aatgtgcaaa feacacacaat cgacggctct tcaggagttg ctaatccagc 26040

aatggatcca atttatgatg agccgacgac gactactagc gtfgcctttgt aagcacaaga 26100 

aagtgagtac gaacttatgt actcattcgt ttcggaagaa acaggtacgt taatagttaa 26160 

tagcgtactt ctttttcttg ctttcgtggt attcttgcta gtcacactag ccatccttac 26220 

tgcgcttcga ttgtgtgcgt actgctgcaa tattgttaac gtgagtttag taaaaccaac 26280 

ggtttacgtc tactcgcgtg ttaaaaatct gaactcttct gaaggagttc ctgatcttct 26340 

ggtctaaacg aactaactat tattattatt ctgtttggaa ctttaacatt gcttatcatg 26400

gcagacaacg gtactattac cgttgaggag cttaaacaac tcctggaaca atggaaccta 26460 

gtaataggtt tcctattcct agcctggatt atgttactac aatttgccta ttctaatcgg 26520 

aacaggtttt tgtacataat aaagcttgtt ttcctctggc tcttgtggcc agtaacactt 26580 

gcttgttttg tgcttgctgc tgtctacaga attaattggg tgactggcgg gattgcgatt 26640 

gcaatggctt gtattgtagg cttgatgtgg cttagctact tcgttgcttc cttcaggctg 26700

tttgctcgta cccgctcaat gtggtcattc aacccagaaa caaacattct tctcaatgtg 26760 

cctctccggg ggacaattgt gaccagaccg ctcatggaaa gtgaacttgt cattggtgct 26820

gtgatcattc gtggtcactt gcgaatggcc ggacactccc tagggcgctg tgacattaag 26880

gacctgccaa aagagatcac tgtggctaca tcacgaacgc tttcttatta caaattagga 26940 

gcgtcgcagc gtgtaggcac tgattcaggt tttgctgcat acaaccgcta ccgtattgga 27000

aactataaat taaatacaga ccacgccggt agcaacgaca atattgcttt gctagtacag 27060

taagtgacaa cagatgtttc atcttgttga cttccaggtt acaatagcag agatattgat 27120

tatcattatg aggactttca ggattgctat ttggaatctt gacgttataa taagttcaat 27180 

agtgagacaa ttatttaagc ctctaactaa gaagaattat tcggagttag atgatgaaga 27240 

acctatggag ttagattatc cataaaacga acatgaaaat tattfctcttc ctgacattga 27300 

ttgtatttac atcttgcgag ctatatcact atcaggagtg tgttagaggt acgactgtac 27360 

tactaaaaga accttgccca tcaggaacat acgagggcaa ttcaccattt caccctcttg 27420 

ctgacaataa atttgcacta acttgcacta gcacacactt tgcttttgct tgtgctgacg 27480 

gtactcgaca tacctatcag ctgcgtgcaa gatcagtttc acceaaactt ttcatcagac 27540 

aagaggaggt tcaacaagag ctctactcgc cactttttct cattgttgct gctctagtat 27600 
ttttaatact ttgcttcacc attaagagaa agacagaatg aatgagctca ctttaattga 27660

cttctatttg tgctttttag cctttctgct attccttgtt ttaataatgc ttattatatt 27720

ttggttttca ctcgaaatcc aggatctaga agaaccttgt accaaagtct aaacgaacat 27780

gaaacttctc attg^tttga cttgtatttc tctatgcagt tgcatatgca ctgtagtaca 27840 

gcgctgtgca tctaataaac ctcatgtgct tgaagatcct tgtaaggtac aacactaggg 27900

gtaatactta tagcactgct tggctttgtg ptctaggaaa ggttttaccf tttcatagat 27960 

ggcacactat ggttcaaaca tgcacaccta atgttactat caactgtcaa gatccagctg 28020 

gtggtgcgct tatagctagg tgttggtacc ttcatgaagg tcaccaaact gctgcattta 28080 

gagacgtact tgttgtttta aataaacgaa caaattaaaa tgtctgataa tggaccccaa 28140 

tcaaaccaac gtagtgfcccc ccgcattaca tttggtggac ccacagattc aactgacaat 28200

aaccagaatg gaggacgcaa tggggcaagg ccaaaacagc gccgScccca aggtttaccc 28260 

aataatactg cgtcttggtt cacagctctc actcagcatg gcaaggagga acttagattc 28320 

cctcgaggcc agggcgttcc aatcaacacc aatagtggtc cagatgacca aattggctac 28380

taccgaagag ctacccgacg agttcgtggt ggtgacggca aaatgaaaga gctcagcccc 28440 

agatggtact tctattacct aggaactggc ccagaagctt cacttcccta cggcgctaac 28500

aaagaaggca tcgtatgggt tgcaactgag ggagccttga atacacccaa agaccacatt 28560 

ggcacccgca atcctaataa caatgctgcc accgtgctac aacttcctca aggaacaaca 28620 

ttgccaaaag gcttctacgc agagggaagc agaggcggca gtcaagcctc ttctcgctcc 28680

tcatcacgta gtcgcggtaa ttcaagaaat tcaactcctg gcagcagtag gggaaattct 28740 

pctgctcgaa tggctagcgg aggtggtgaa actgccctcg cgctattgct gctagacaga 28800

ttgaaccagc ttgagagcaa agtttctggt aaaggccaac aacaacaagg ccaaactgtc 28860

actaagaaat ctgctgctga ggcatctaaa aagcctcgcc aaaaacgtac tgccacaaaa 28920 

cagtacaacg tcactcaagc atttgggaga cgtggtccag aacaaaccca aggaaatttc 28980 

ggggaccaag acctaatcag acaaggaact gattacaaac attggccgca aattgcacaa 29040

tttgctccaa gtgcctctgc attctttgga atgtcacgca ttggcatgga agtcacacct 29100

tcgggaacat ggctgactta tcatggagcc attaaattgg atgacaaaga tccacaattc 29160

aaagacaacg tcatactgct gaacaagcac attgacgcat acaaaacatt cccaccaaca 29220

gagcctaaaa aggacaaaaa gaaaaagact gatgaagctc agcctttgcc gcagagacaa 29280 

aagaagcagc ccactgtgac tcttcttcct gcggctgaca tggatgattt ctccagacaa 29340 

cttcaaaatt ccatgagtgg agcttctgct gattcaactc aggcataaac actcatgatg 29400 

accacacaag gcagatgggc tatgtaaacg ttttcgcaat tccgtttacg atacatagtc 29460 
tactcttgtg cagaatgaat tctcgtaact aaacagcaca agtaggttta gttaacttta 29520

atctcacata gcaatcttta atcaatgtgt aacattaggg aggacttgaa agagccacca 29580

cattttcatc gaggccacgc ggagtacgat cgagggtaca gtgaataatg ctagggagag 29640

ctgcctatat ggaagagccc taatgtgtaa aattaatttt agtagtgcta tccccatgtg 29700 

attttaatag cttcttagga gaatgacaaa aaaaaaaaaa aaaaaaaaaa a 29751 



<210> 16
<211> 47
<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 16

acattttcat cgaggccacg cggagtacga tcgagggtac agtgaat 47

<210> 17 
<211> 32 
<212> DNA 

<213> Severe acute respiratory syndrome virus 

<400> 17 

cgaggccacg cggagtacga tcgagggtac ag 32 

<210> 18

<211> 339
<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 18

acactcatga tgaccacaca aggcagatgg gctatgtaaa cgttttcgca attccgttta 60
cgatacatag tctactcttg tgcagaatga attctcgtaa ctaaacagca caagtaggtt 120 
tagttaactt taatctcaca tagcaatctt taatcaatgt gtaacattag ggaggacttg 180 
aaagagccac cacattttca tcgaggccac gcggagtacg atcgagggta cagtgaataa 240 
tgctagggag agctgcctat atggaagagc cctaatgtgt aaaattaatt ttagtagtgc 300 
tatccccatg tgattttaat agcttcttag gagaatgac 339 



<210> 19 

<211> 35

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> s2m motif 



<220> 

<221> misc_feature 
<222> (5)..(5)
<223> n is a, c, or t
<220>

<221> misc_feature
<222> (23)..(23)
<223> n is a, c, or t

<400> 19 

gccgnggcca cgcsgagtas gancgagggt acags 



<210> 20 

<211> 26 

<212> RNA 

<213> Severe acute respiratory syndrome virus

<400> 20 

ucucuaaacg aacuuuaaaa ucugug 



<210> 21 

<211> 16

<212> RNA

<213> Severe acute respiratory syndrome virus
<400> 21

caacuaaacg aacaug



<210> 22

<211> 18

<212> RNA

<213> Severe acute respiratory syndrome virus 

<400> 22
cacauaaacg aacuuaug

<210> 23

<211> 16 

<212> RNA 

<213> Severe acute respiratory syndrome virus 


<400> 23 
ugaguacgaa cuuaug 



<210> 24 

<211> 18

<212> RNA 

<213> Severe acute respiratory syndrome virus 

<400> 24 

ggucuaaacg aacuaacu 

<210> 25

<211> 11 

<212> RNA 

<213> Severe acute respiratory syndrome virus 
<400> 25
aacuauaaau u 11



<210> 26 

<211> 17 

<212> RNA 

<213> Severe acute respiratory syndrome virus

<400> 26
uccauaaaac gaacaug 17



<210> 27 

<211> 24

<212> RNA 

<213> Severe acute respiratory syndrome virus 

<400> 27

ugcucuagua uuuuuaauac uuug 24



<210> 28 
<211> 16 
<212> RNA 

<213> Severe acute respiratory syndrome virus 
<400> 28 

agucuaaacg aacaug 16



<210> 29

<211> 15 

<212> RNA 

<213> Severe acute respiratory syndrome virus 

<400> 29
cuaauaaacc ucaug 15



<210> 30 
<211> 24 
<212> RNA 

<213> Severe acute respiratory syndrome virus 
<400> 30 

uaaauaaacg aacaaauuaa aaug 24



<210> 31

<211> 136
<212> DNA

<213> Equine rhinovirus
<400> 31

acccgttacc ctaaaattcc ctcccctttc tcttcactcg ccgaggccac gccgagtagg 60
accgagggta cagcgagtct tttagtttaa ggtgttagat gtaaggtacg tgggctttct 120
tttggtttac ttcttc 136
<210> 32

<211> 178
<212> DNA

<213> Avian infectious bronchitis
<400> 32

tagtttagtt taagttagtt tagagtaggt ataaagatgc cagtgccggg gccacgcgga 60
gtacgatcga gggtacagca ctaggacgcc cattagggga agagctaaat tttagtttaa 120
gttaagttta attggctaag tatagttaaa atttataggc tagtatagag ttagagca 178

<210> 33
<211> 1255
<212> PRT

<213> Severe acute respiratory syndrome virus
<400> 33

Met Phe Ile Phe Leu Leu Phe Leu Thr Leu Thr Ser Gly Ser Asp Leu
1 5 10 15

Asp Arg Cys Thr Thr Phe Asp Asp Val Gln Ala Pro Asn Tyr Thr Gln
20 25 30

His Thr Ser Ser Met Arg Gly Val Tyr Tyr Pro Asp Glu Ile Phe Arg
35 40 45

Ser Asp Thr Leu Tyr Leu Thr Gln Asp Leu Phe Leu Pro Phe Tyr Ser
50 55 60

Asn Val Thr Gly Phe His Thr Ile Asn His Thr Phe Gly Asn Pro Val
65 70 75 80

Ile Pro Phe Lys Asp Gly Ile Tyr Phe Ala Ala Thr Glu Lys Ser Asn
85 90 95

Val Val Arg Gly Trp Val Phe Gly Ser Thr Met Asn Asn Lys Ser Gln
100 105 110

Ser Val Ile Ile Ile Asn Asn Ser Thr Asn Val Val Ile Arg Ala Cys
115 120 125

Asn Phe Glu Leu Cys Asp Asn Pro Phe Phe Ala Val Ser Lys Pro Met
130 135 140

Gly Thr Gln Thr His Thr Met Ile Phe Asp Asn Ala Phe Asn Cys Thr
145 150 155 160
Phe Glu Tyr Ile Ser Asp Ala Phe Ser Leu Asp Val Ser Glu Lys Ser
165 170 175

Gly Asn Phe Lys His Leu Arg Glu Phe Val Phe Lys Asn Lys Asp Gly
180 185 190

Phe Leu Tyr Val Tyr Lys Gly Tyr Gln Pro Ile Asp Val Val Arg Asp
195 200 205

Leu Pro Ser Gly Phe Asn Thr Leu Lys Pro Ile Phe Lys Leu Pro Leu
210 215 220

Gly Ile Asn Ile Thr Asn Phe Arg Ala Ile Leu Thr Ala Phe Ser Pro
225 230 235 240

Ala Gln Asp Ile Trp Gly Thr Ser Ala Ala Ala Tyr Phe Val Gly Tyr
245 250 255

Leu Lys Pro Thr Thr Phe Met Leu Lys Tyr Asp Glu Asn Gly Thr Ile
260 265 270

Thr Asp Ala Val Asp Cys Ser Gln Asn Pro Leu Ala Glu Leu Lys Cys
275 280 285

Ser Val Lys Ser Phe Glu Ile Asp Lys Gly Ile Tyr Gln Thr Ser Asn
290 295 300

Phe Arg Val Val Pro Ser Gly Asp Val Val Arg Phe Pro Asn Ile Thr
305 310 315 320

Asn Leu Cys Pro Phe Gly Glu Val Phe Asn Ala Thr Lys Phe Pro Ser
325 330 335

Val Tyr Ala Trp Glu Arg Lys Lys Ile Ser Asn Cys Val Ala Asp Tyr
340 345 350

Ser Val Leu Tyr Asn Ser Thr Phe Phe Ser Thr Phe Lys Cys Tyr Gly
355 360 365

Val Ser Ala Thr Lys Leu Asn Asp Leu Cys Phe Ser Asn Val Tyr Ala
370 375 380

Asp Ser Phe Val Val Lys Gly Asp Asp Val Arg Gln Ile Ala Pro Gly
385 390 395 400

Gln Thr Gly Val Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe
405 410 415



Met Gly Cys Val Leu Ala Trp Asn Thr Arg Asn Ile Asp Ala Thr Ser
420 425 430

Thr Gly Asn Tyr Asn Tyr Lys Tyr Arg Tyr Leu Arg His Gly Lys Leu
435 440 445

Arg Pro Phe Glu Arg Asp Ile Ser Asn Val Pro Phe Ser Pro Asp Gly
450 455 460

Lys Pro Cys Thr Pro Pro Ala Leu Asn Cys Tyr Trp Pro Leu Asn Asp
465 470 475 480

Tyr Gly Phe Tyr Thr Thr Thr Gly Ile Gly Tyr Gln Pro Tyr Arg Val
485 490 495

Val Val Leu Ser Phe Glu Leu Leu Asn Ala Pro Ala Thr Val Cys Gly
500 505 510

Pro Lys Leu Ser Thr Asp Leu Ile Lys Asn Gln Cys Val Asn Phe Asn
515 520 525

Phe Asn Gly Leu Thr Gly Thr Gly Val Leu Thr Pro Ser Ser Lys Arg
530 535 540

Phe Gln Pro Phe Gln Gln Phe Gly Arg Asp Val Ser Asp Phe Thr Asp
545 550 555 560

Ser Val Arg Asp Pro Lys Thr Ser Glu Ile Leu Asp Ile Ser Pro Cys
565 570 575

Ala Phe Gly Gly Val Ser Val Ile Thr Pro Gly Thr Asn Ala Ser Ser
580 585 590

Glu Val Ala Val Leu Tyr Gln Asp Val Asn Cys Thr Asp Val Ser Thr
595 600 605

Ala Ile His Ala Asp Gln Leu Thr Pro Ala Trp Arg Ile Tyr Ser Thr
610 615 620

Gly Asn Asn Val Phe Gln Thr Gln Ala Gly Cys Leu Ile Gly Ala Glu
625 630 635 640

His Val Asp Thr Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile
645 650 655
Cys Ala Ser Tyr His Thr Val Ser Leu Leu Arg Ser Thr Ser Gln Lys
660 665 670

Ser Ile Val Ala Tyr Thr Met Ser Leu Gly Ala Asp Ser Ser Ile Ala
675 680 685

Tyr Ser Asn Asn Thr Ile Ala Ile Pro Thr Asn Phe Ser Ile Ser Ile
690 695 700

Thr Thr Glu Val Met Pro Val Ser Met Ala Lys Thr Ser Val Asp Cys
705 710 715 720

Asn Met Tyr Ile Cys Gly Asp Ser Thr Glu Cys Ala Asn Leu Leu Leu
725 730 735

Gln Tyr Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Ser Gly Ile
740 745 750

Ala Ala Glu Gln Asp Arg Asn Thr Arg Glu Val Phe Ala Gln Val Lys
755 760 765

Gln Met Tyr Lys Thr Pro Thr Leu Lys Tyr Phe Gly Gly Phe Asn Phe
770 775 780

Ser Gln Ile Leu Pro Asp Pro Leu Lys Pro Thr Lys Arg Ser Phe Ile
785 790 795 800

Glu Asp Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly Phe Met
805 810 815

Lys Gln Tyr Gly Glu Cys Leu Gly Asp Ile Asn Ala Arg Asp Leu Ile
820 825 830

Cys Ala Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu Leu Thr
835 840 845

Asp Asp Met Ile Ala Ala Tyr Thr Ala Ala Leu Val Ser Gly Thr Ala
850 855 860

Thr Ala Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile Pro Phe
865 870 875 880

Ala Met Gln Met Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr Gln Asn
885 890 895
Val Leu Tyr Glu Asn Gln Lys Gln Ile Ala Asn Gln Phe Asn Lys Ala
900 905 910

Ile Ser Gln Ile Gln Glu Ser Leu Thr Thr Thr Ser Thr Ala Leu Gly
915 920 925

Lys Leu Gln Asp Val Val Asn Gln Asn Ala Gln Ala Leu Asn Thr Leu
930 935 940

Val Lys Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val Leu Asn
945 950 955 960

Asp Ile Leu Ser Arg Leu Asp Lys Val Glu Ala Glu Val Gln Ile Asp
965 970 975

Arg Leu Ile Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val Thr Gln
980 985 990

Gln Leu Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn Leu Ala Ala
995 1000 1005

Thr Lys Met Ser Glu Cys Val Leu Gly Gln Ser Lys Arg Val Asp
1010 1015 1020

Phe Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro Gln Ala Ala
1025 1030 1035

Pro His Gly Val Val Phe Leu His Val Thr Tyr Val Pro Ser Gln
1040 1045 1050

Glu Arg Asn Phe Thr Thr Ala Pro Ala Ile Cys His Glu Gly Lys
1055 1060 1065

Ala Tyr Phe Pro Arg Glu Gly Val Phe Val Phe Asn Gly Thr Ser
1070 1075 1080

Trp Phe Ile Thr Gln Arg Asn Phe Phe Ser Pro Gln Ile Ile Thr
1085 1090 1095

Thr Asp Asn Thr Phe Val Ser Gly Asn Cys Asp Val Val Ile Gly
1100 1105 1110

Ile Ile Asn Asn Thr Val Tyr Asp Pro Leu Gln Pro Glu Leu Asp
1115 1120 1125
Ser Phe Lys Glu Glu Leu Asp Lys Tyr Phe Lys Asn His Thr Ser
1130 1135 1140

Pro Asp Val Asp Leu Gly Asp Ile Ser Gly Ile Asn Ala Ser Val
1145 1150 1155

Val Asn Ile Gln Lys Glu Ile Asp Arg Leu Asn Glu Val Ala Lys
1160 1165 1170

Asn Leu Asn Glu Ser Leu Ile Asp Leu Gln Glu Leu Gly Lys Tyr
1175 1180 1185

Glu Gln Tyr Ile Lys Trp Pro Trp Tyr Val Trp Leu Gly Phe Ile
1190 1195 1200

Ala Gly Leu Ile Ala Ile Val Met Val Thr Ile Leu Leu Cys Cys
1205 1210 1215

Met Thr Ser Cys Cys Ser Cys Leu Lys Gly Ala Cys Ser Cys Gly
1220 1225 1230

Ser Cys Cys Lys Phe Asp Glu Asp Asp Ser Glu Pro Val Leu Lys
1235 1240 1245

Gly Val Lys Leu His Tyr Thr
1250 1255



<210> 34 

<211> 220 

<212> PRT 

<213> Severe acute respiratory syndrome virus 

<400> 34 

Met Ala Asp Asn Gly Thr Ile Thr Val Glu Glu Leu Lys Gln Leu Leu
1 5 10 15

Glu Gln Trp Asn Leu Val Ile Gly Phe Leu Phe Leu Ala Trp Ile Met
20 25 30

Leu Leu Gln Phe Ala Tyr Ser Asn Arg Asn Arg Phe Leu Tyr Ile Ile
35 40 45

Lys Leu Val Phe Leu Trp Leu Leu Trp Pro Val Thr Leu Ala Cys Phe
50 55 60

Val Leu Ala Ala Val Tyr Arg Ile Asn Trp Val Thr Gly Gly Ile Ala
65 70 75 80
Ile Ala Met Ala Cys Ile Val Gly Leu Met Trp Leu Ser Tyr Phe Val
85 90 95

Ala Ser Phe Arg Leu Phe Ala Arg Thr Arg Ser Met Trp Ser Phe Asn
100 105 110

Pro Glu Thr Asn Ile Leu Leu Asn Val Pro Leu Arg Gly Thr Ile Val
115 120 125

Thr Arg Pro Leu Met Glu Ser Glu Leu Val Ile Gly Ala Val Ile Ile
130 135 140

Arg Gly His Leu Arg Met Ala Gly His Ser Leu Gly Arg Cys Asp Ile
145 150 155 160

Lys Asp Leu Pro Lys Glu Ile Thr Val Ala Thr Ser Arg Thr Leu Ser
165 170 175

Tyr Tyr Lys Leu Gly Ala Ser Gln Arg Val Gly Thr Asp Ser Gly Phe
180 185 190

Ala Ala Tyr Asn Arg Tyr Arg Ile Gly Asn Tyr Lys Leu Asn Thr Asp
195 200 205

His Ala Gly Ser Asn Asp Asn Ile Ala Leu Leu Val
210 215 220



<210> 35

<211> 76 

<212> PRT 

<213> Severe acute respiratory syndrome virus 


<400> 35 

Met Tyr Ser Phe Val Ser Glu Glu Thr Gly Thr Leu Ile Val Asn Ser
1 5 10 15

Val Leu Leu Phe Leu Ala Phe Val Val Phe Leu Leu Val Thr Leu Ala
20 25 30

Ile Leu Thr Ala Leu Arg Leu Cys Ala Tyr Cys Cys Asn Ile Val Asn
35 40 45

Val Ser Leu Val Lys Pro Thr Val Tyr Val Tyr Ser Arg Val Lys Asn
50 55 60

Leu Asn Ser Ser Glu Gly Val Pro Asp Leu Leu Val
65 70 75



<210> 36

<211> 422

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 36

Met Ser Asp Asn Gly Pro Gln Ser Asn Gln Arg Ser Ala Pro Arg Ile
1 5 10 15

Thr Phe Gly Gly Pro Thr Asp Ser Thr Asp Asn Asn Gln Asn Gly Gly
20 25 30

Arg Asn Gly Ala Arg Pro Lys Gln Arg Arg Pro Gln Gly Leu Pro Asn
35 40 45

Asn Thr Ala Ser Trp Phe Thr Ala Leu Thr Gln His Gly Lys Glu Glu
50 55 60

Leu Arg Phe Pro Arg Gly Gln Gly Val Pro Ile Asn Thr Asn Ser Gly
65 70 75 80

Pro Asp Asp Gln Ile Gly Tyr Tyr Arg Arg Ala Thr Arg Arg Val Arg
85 90 95

Gly Gly Asp Gly Lys Met Lys Glu Leu Ser Pro Arg Trp Tyr Phe Tyr
100 105 110

Tyr Leu Gly Thr Gly Pro Glu Ala Ser Leu Pro Tyr Gly Ala Asn Lys
115 120 125

Glu Gly Ile Val Trp Val Ala Thr Glu Gly Ala Leu Asn Thr Pro Lys
130 135 140

Asp His Ile Gly Thr Arg Asn Pro Asn Asn Asn Ala Ala Thr Val Leu
145 150 155 160

Gln Leu Pro Gln Gly Thr Thr Leu Pro Lys Gly Phe Tyr Ala Glu Gly
165 170 175

Ser Arg Gly Gly Ser Gln Ala Ser Ser Arg Ser Ser Ser Arg Ser Arg
180 185 190

Gly Asn Ser Arg Asn Ser Thr Pro Gly Ser Ser Arg Gly Asn Ser Pro
195 200 205
Ala Arg Met Ala Ser Gly Gly Gly Glu Thr Ala Leu Ala Leu Leu Leu
210 215 220

Leu Asp Arg Leu Asn Gln Leu Glu Ser Lys Val Ser Gly Lys Gly Gln
225 230 235 240

Gln Gln Gln Gly Gln Thr Val Thr Lys Lys Ser Ala Ala Glu Ala Ser
245 250 255

Lys Lys Pro Arg Gln Lys Arg Thr Ala Thr Lys Gln Tyr Asn Val Thr
260 265 270

Gln Ala Phe Gly Arg Arg Gly Pro Glu Gln Thr Gln Gly Asn Phe Gly
275 280 285

Asp Gln Asp Leu Ile Arg Gln Gly Thr Asp Tyr Lys His Trp Pro Gln
290 295 300

Ile Ala Gln Phe Ala Pro Ser Ala Ser Ala Phe Phe Gly Met Ser Arg
305 310 315 320

Ile Gly Met Glu Val Thr Pro Ser Gly Thr Trp Leu Thr Tyr His Gly
325 330 335

Ala Ile Lys Leu Asp Asp Lys Asp Pro Gln Phe Lys Asp Asn Val Ile
340 345 350

Leu Leu Asn Lys His Ile Asp Ala Tyr Lys Thr Phe Pro Pro Thr Glu
355 360 365

Pro Lys Lys Asp Lys Lys Lys Lys Thr Asp Glu Ala Gln Pro Leu Pro
370 375 380

Gln Arg Gln Lys Lys Gln Pro Thr Val Thr Leu Leu Pro Ala Ala Asp
385 390 395 400

Met Asp Asp Phe Ser Arg Gln Leu Gln Asn Ser Met Ser Gly Ala Ser
405 410 415

Ala Asp Ser Thr Gln Ala
420



<210> 37 
<211> 230 
<212> PRT 
<213> Bovine coronavirus



<400> 37 



Met Ser Ser Val Thr Thr Pro Ala Pro Val Tyr Thr Trp Thr Ala Asp
1 5 10 15

Glu Ala Ile Lys Phe Leu Lys Glu Trp Asn Phe Ser Leu Gly Ile Ile
20 25 30

Leu Leu Phe Ile Thr Val Ile Leu Gln Phe Gly Tyr Thr Ser Arg Ser
35 40 45

Met Phe Val Tyr Val Ile Lys Met Val Ile Leu Trp Leu Met Trp Pro
50 55 60

Leu Thr Ile Ile Leu Thr Ile Phe Asn Cys Val Tyr Ala Leu Asn Asn
65 70 75 80

Val Tyr Leu Gly Phe Ser Ile Val Phe Thr Ile Val Ala Ile Ile Met
85 90 95

Trp Ile Val Tyr Phe Val Asn Ser Ile Arg Leu Phe Ile Arg Thr Gly
100 105 110

Ser Trp Trp Ser Phe Asn Pro Glu Thr Asn Asn Leu Met Cys Ile Asp
115 120 125

Met Lys Gly Arg Met Tyr Val Arg Pro Ile Ile Glu Asp Tyr His Thr
130 135 140

Leu Thr Val Thr Ile Ile Arg Gly His Leu Tyr Met Gln Gly Ile Lys
145 150 155 160

Leu Gly Thr Gly Tyr Ser Leu Ser Asp Leu Pro Ala Tyr Val Thr Val
165 170 175

Ala Lys Val Ser His Leu Leu Thr Tyr Lys Arg Gly Phe Leu Asp Lys
180 185 190

Ile Gly Asp Thr Ser Gly Phe Ala Val Tyr Val Lys Ser Lys Val Gly
195 200 205

Asn Tyr Arg Leu Pro Ser Thr Gln Lys Gly Ser Gly Leu Asp Thr Ala
210 215 220

Leu Leu Arg Asn Asn Ile
225 230

<210> 38

<211> 226

<212> PRT

<213> Avian infectious bronchitis virus

<400> 38

Met Ser Asn Gly Thr Glu Asn Cys Thr Leu Ser Thr Gln Gln Ala Ala
1 5 10 15

Glu Leu Phe Lys Glu Tyr Asn Leu Phe Ile Thr Ala Phe Leu Leu Phe
20 25 30

Leu Thr Ile Leu Leu Gln Tyr Gly Tyr Ala Thr Arg Ser Arg Phe Ile
35 40 45

Tyr Ile Leu Lys Met Ile Val Leu Trp Cys Phe Trp Pro Leu Asn Ile
50 55 60

Ala Val Gly Ile Ile Ser Cys Ile Tyr Pro Pro Asn Thr Gly Gly Leu
65 70 75 80

Val Ala Ala Ile Ile Leu Thr Val Phe Ala Cys Leu Ser Phe Val Gly
85 90 95

Tyr Trp Ile Gln Ser Phe Arg Leu Phe Lys Arg Cys Arg Ser Trp Trp
100 105 110

Ser Phe Asn Pro Glu Ser Asn Ala Val Gly Ser Ile Leu Leu Thr Asn
115 120 125

Gly Gln Gln Cys Asn Phe Ala Ile Glu Ser Val Pro Met Val Leu Ser
130 135 140

Pro Ile Ile Lys Asn Gly Ala Leu Tyr Cys Glu Gly Gln Trp Leu Ala
145 150 155 160

Lys Cys Glu Pro Asp His Leu Pro Lys Asp Ile Phe Val Cys Thr Pro
165 170 175

Asp Arg Arg Asn Ile Tyr Arg Met Val Gln Lys Tyr Thr Gly Asp Gln
180 185 190

Ser Gly Asn Lys Lys Arg Phe Ala Thr Phe Val Tyr Ala Lys Gln Ser
195 200 205
Val Asp Thr Gly Glu Leu Gly Ser Val Ala Thr Gly Gly Ser Ser Leu
210 215 220



Tyr Thr 
225 



<210> 39

<211> 262

<212> PRT

<213> Transmissible gastroenteritis virus

<400> 39

Met Lys Ile Leu Leu Ile Leu Ala Cys Val Ile Ala Cys Ala Cys Gly
1 5 10 15

Glu Arg Tyr Cys Ala Met Lys Ser Asp Thr Asp Leu Ser Cys Arg Asn
20 25 30

Ser Thr Ala Ser Asp Cys Glu Ser Cys Phe Asn Gly Gly Asp Leu Ile
35 40 45

Trp His Leu Ala Asn Trp Asn Phe Ser Trp Ser Ile Ile Leu Ile Val
50 55 60

Phe Ile Thr Val Leu Gln Tyr Gly Arg Pro Gln Phe Ser Trp Phe Val
65 70 75 80

Tyr Gly Ile Lys Met Leu Ile Met Trp Leu Leu Trp Pro Val Val Leu
85 90 95

Ala Leu Thr Ile Phe Asn Ala Tyr Ser Glu Tyr Gln Val Ser Arg Tyr
100 105 110

Val Met Phe Gly Phe Ser Ile Ala Gly Ala Ile Val Thr Phe Val Leu
115 120 125

Trp Ile Met Tyr Phe Val Arg Ser Ile Gln Leu Tyr Arg Arg Thr Lys
130 135 140

Ser Trp Trp Ser Phe Asn Pro Glu Thr Lys Ala Ile Leu Cys Val Ser
145 150 155 160

Ala Leu Gly Arg Ser Tyr Val Leu Pro Leu Glu Gly Val Pro Thr Gly
165 170 175



Val Thr Leu Thr Leu Leu Ser Gly Asn Leu Tyr Ala Glu Gly Phe Lys
180 185 190

WO 2004/096842 PCT/CA2004/000626



Ile Ala Gly Gly Met Asn Ile Asp Asn Leu Pro Lys Tyr Val Met Val
195 200 205

Ala Leu Pro Ser Arg Thr Ile Val Tyr Thr Leu Val Gly Lys Lys Leu
210 215 220

Lys Ala Ser Ser Ala Thr Gly Trp Ala Tyr Tyr Val Lys Ser Lys Ala
225 230 235 240

Gly Asp Tyr Ser Thr Glu Ala Arg Thr Asp Asn Leu Ser Glu Gln Glu
245 250 255

Lys Leu Leu His Met Val
260



<210> 40

<211> 263

<212> PRT

<213> Feline coronavirus

<400> 40

Met Lys Ile Leu Leu Ile Leu Ala Cys Ala Val Ala Cys Val Tyr Gly
1 5 10 15

Glu Gln Ile Arg Tyr Cys Ala Met Gln Glu Thr Gly Leu Ser Cys Arg
20 25 30

Asn Gly Thr Ala Ser Asp Cys Glu Ser Cys Phe Asn Gly Gly Asp Leu
35 40 45

Ile Trp His Leu Ala Asn Trp Asn Phe Ser Trp Ser Ile Ile Leu Ile
50 55 60

Val Phe Ile Thr Val Leu Gln Tyr Gly Arg Pro Gln Phe Ser Trp Phe
65 70 75 80

Val Tyr Gly Ile Lys Met Leu Ile Met Trp Leu Leu Trp Pro Ile Val
85 90 95

Leu Ala Leu Thr Ile Phe Asn Ala Tyr Ser Glu Tyr Glu Val Ser Arg
100 105 110

Tyr Val Met Phe Gly Phe Ser Val Ala Gly Ala Val Val Thr Phe Ala
115 120 125
Leu Trp Met Met Tyr Phe Val Arg Ser Ile Gln Leu Tyr Arg Arg Thr
130 135 140

Lys Ser Trp Trp Ser Phe Asn Pro Glu Thr Asn Ala Ile Leu Cys Val
145 150 155 160

Asn Ala Leu Gly Arg Ser Tyr Val Leu Pro Leu Asp Gly Thr Pro Thr
165 170 175

Gly Val Thr Leu Thr Leu Leu Ser Gly Asn Leu Tyr Ala Glu Gly Phe
180 185 190

Lys Met Ala Gly Gly Leu Thr Ile Glu His Leu Pro Lys Tyr Val Met
195 200 205

Ile Arg Thr Pro Asn Arg Thr Ile Val Tyr Thr Leu Val Gly Lys Gln
210 215 220

Leu Lys Ala Thr Thr Ala Thr Gly Trp Ala Tyr Tyr Val Lys Ser Lys
225 230 235 240

Ala Gly Asp Tyr Ser Thr Glu Ala Arg Thr Asp Asn Leu Ser Glu His
245 250 255

Glu Lys Leu Leu His Met Val
260



<210> 41 

<211> 231

<212> PRT 

<213> Human coronavirus OC43 

MSSKTTPAPVYIWTADEAIKFLPCEWNFSLGIILLFITIILQFGYTSRSMFVYVIKMIILWLMWPLTIILTIFNCVY

ALNNVYLGLSIVFTIVAIIMWIVYFVNSIRLFIRTGSFWSFNPETNNLMCIDMKGTMYVRPIIEDYHTLTVTIIRG

HLYIQGIKLGTGYSWADLPAYMTVAKVTHLCTYKRGFLDRISDTSGFAVYVKSKVGNYRLPSTQKGSGMDTALLRN
NI

<SEQ ID NO : 37 ;prt; Porcine hemagglutinating encephalomyelitis virus

<400> 41

Met Ser Ser Pro Thr Thr Pro Val Pro Val Ile Ser Trp Thr Ala Asp
1 5 10 15

Glu Ala Ile Lys Phe Leu Lys Glu Trp Asn Phe Ser Leu Gly Ile Ile
20 25 30

Val Leu Phe Ile Thr Ile Ile Leu Gln Phe Gly Tyr Thr Ser Arg Ser
35 40 45
Met Phe Val Tyr Val Ile Lys Met Val Ile Leu Trp Leu Met Trp Pro
50 55 60

Leu Thr Ile Ile Leu Thr Ile Phe Asn Cys Val Tyr Ala Leu Asn Asn
65 70 75 80

Val Tyr Leu Gly Phe Ser Ile Val Phe Thr Ile Val Ala Ile Ile Met
85 90 95

Trp Val Val Tyr Phe Val Asn Ser Ile Arg Leu Phe Ile Arg Thr Gly
100 105 110

Ser Trp Trp Ser Phe Asn Pro Glu Thr Asn Asn Leu Met Cys Ile Asp
115 120 125

Met Lys Gly Arg Met Tyr Val Arg Pro Ile Ile Glu Asp Tyr His Thr
130 135 140

Leu Thr Ala Thr Ile Ile Arg Gly His Leu Tyr Ile Gln Gly Ile Lys
145 150 155 160

Leu Gly Thr Gly Tyr Ser Leu Ser Asp Leu Pro Ala Tyr Val Thr Val
165 170 175

Ala Lys Val Thr His Leu Cys Thr Tyr Lys Arg Gly Phe Leu Asp Arg
180 185 190

Ile Gly Asp Thr Ser Gly Phe Ala Val Tyr Val Lys Ser Lys Val Gly
195 200 205

Asn Tyr Arg Leu Pro Ser Thr His Lys Gly Ser Gly Met Asp Thr Ala
210 215 220

Leu Leu Arg Asn Asn Ile Met
225 230



<210> 42 

<211> 223

<212> PRT 

<213> Avian infectious bronchitis virus 

<400> 42 

Met Met Glu Asn Cys Thr Leu Asn Leu Glu Gln Ala Thr Leu Leu Phe
1 5 10 15



Lys Glu Tyr Asn Leu Phe Ile Thr Ala Phe Leu Leu Phe Leu Thr Ile
20 25 30



Leu Leu Gln Tyr Gly Tyr Ala Thr Arg Ser Arg Phe Ile Tyr Ile Leu
35 40 45

Lys Met Ile Val Leu Trp Cys Phe Trp Pro Leu Asn Ile Ala Val Gly
50 55 60

Val Ile Ser Cys Ile Tyr Pro Pro Asn Thr Gly Gly Leu Val Ala Ala
65 70 75 80

Ile Ile Leu Thr Val Phe Ala Cys Leu Ser Phe Val Gly Tyr Trp Ile
85 90 95

Gln Ser Cys Arg Leu Phe Lys Arg Cys Arg Ser Trp Trp Ser Phe Asn
100 105 110

Pro Glu Ser Asn Ala Val Gly Ser Ile Leu Leu Thr Asn Gly Gln Gln
115 120 125

Cys Asn Phe Ala Ile Glu Ser Val Pro Met Val Leu Ala Pro Ile Ile
130 135 140

Lys Asn Gly Val Leu Tyr Cys Glu Gly Gln Trp Leu Ala Lys Cys Glu
145 150 155 160

Pro Asp His Leu Pro Lys Asp Ile Phe Val Cys Thr Pro Asp Arg Arg
165 170 175

Asn Ile Tyr Arg Met Val Gln Lys Tyr Thr Gly Asp Gln Ser Gly Asn
180 185 190

Lys Lys Arg Val Ala Thr Phe Val Tyr Ala Lys Gln Ser Val Asp Thr
195 200 205

Gly Glu Leu Glu Ser Val Pro Thr Gly Gly Ser Ser Leu Tyr Thr
210 215 220



<210> 43 

<211> 455 

<212> PRT 

<213> Mouse Hepatitis Virus 

<400> 43 

Met Ser Phe Val Pro Gly Gln Glu Asn Ala Gly Ser Arg Ser Ser Ser
1 5 10 15

Val Asn Arg Ala Gly Asp Gly Ile Leu Lys Lys Thr Thr Trp Ala Asp
20 25 30

Gln Thr Glu Arg Gly Pro Asn Asn Gln Asn Arg Gly Arg Arg Asn Gln
35 40 45

Pro Lys Gln Thr Ala Thr Thr Gln Pro Asn Ser Gly Ser Val Val Pro
50 55 60

His Tyr Ser Trp Phe Ser Gly Ile Thr Gln Phe Gln Lys Gly Lys Glu
65 70 75 80

Phe Gln Phe Ala Gln Gly Gln Gly Val Pro Ile Ala Asn Gly Ile Pro
85 90 95

Ala Ser Glu Gln Lys Gly Tyr Trp Tyr Arg His Asn Arg Arg Ser Phe
100 105 110

Lys Thr Pro Asp Gly Gln Gln Lys Gln Leu Leu Pro Arg Trp Tyr Phe
115 120 125

Tyr Tyr Leu Gly Thr Gly Pro His Ala Gly Ala Glu Tyr Gly Asp Asp
130 135 140

Ile Asp Gly Val Val Trp Val Ala Ser Gln Gln Ala Asp Thr Lys Thr
145 150 155 160

Thr Ala Asp Ile Val Glu Arg Asp Pro Ser Ser His Glu Ala Ile Pro
165 170 175

Thr Arg Phe Ala Pro Gly Thr Val Leu Pro Gln Gly Phe Tyr Val Glu
180 185 190

Gly Ser Gly Arg Ser Ala Pro Ala Ser Arg Ser Gly Ser Arg Ser Gln
195 200 205

Ser Arg Gly Pro Asn Asn Arg Ala Arg Ser Ser Ser Asn Gln Arg Gln
210 215 220

Pro Ala Ser Thr Val Lys Pro Asp Met Ala Glu Glu Ile Ala Ala Leu
225 230 235 240

Val Leu Ala Lys Leu Gly Lys Asp Ala Gly Gln Pro Lys Gln Val Thr
245 250 255
Lys Gln Ser Ala Lys Glu Val Arg Gln Lys Ile Leu Asn Lys Pro Arg
260 265 270

Gln Lys Arg Thr Pro Asn Lys Gln Cys Pro Val Gln Gln Cys Phe Gly
275 280 285

Lys Arg Gly Pro Asn Gln Asn Phe Gly Gly Ser Glu Met Leu Lys Leu
290 295 300

Gly Thr Ser Asp Pro Gln Phe Pro Ile Leu Ala Glu Leu Ala Pro Thr
305 310 315 320

Pro Ser Ala Phe Phe Phe Gly Ser Lys Leu Glu Leu Val Lys Lys Asn
325 330 335

Ser Gly Gly Ala Asp Asp Pro Thr Lys Asp Val Tyr Glu Leu Gln Tyr
340 345 350

Ser Gly Ala Ile Arg Phe Asp Ser Thr Leu Pro Gly Phe Glu Thr Ile
355 360 365

Met Lys Val Leu Asn Glu Asn Leu Asp Ala Tyr Gln Asp Gln Ala Gly
370 375 380

Gly Ala Asp Val Val Ser Pro Lys Pro Gln Arg Lys Arg Gly Thr Lys
385 390 395 400

Gln Lys Ala Leu Lys Gly Glu Val Asp Asn Val Ser Val Ala Lys Pro
405 410 415

Lys Ser Ser Val Gln Arg Asn Val Ser Arg Glu Leu Thr Pro Glu Asp
420 425 430

Arg Ser Leu Leu Ala Gln Ile Leu Asp Asp Gly Val Val Pro Asp Gly
435 440 445

Leu Glu Asp Asp Ser Asn Val
450 455



<210> 44 

<211> 448

<212> PRT 

<213> Bovine coronavirus 

<400> 44 



Met Ser Phe Thr Pro Gly Lys Gln Ser Ser Ser Arg Ala Ser Ser Gly
1 5 10 15

Asn Arg Ser Gly Asn Gly Ile Leu Lys Trp Ala Asp Gln Ser Asp Gln
20 25 30

Ser Arg Asn Val Gln Thr Arg Gly Arg Arg Ala Gln Pro Lys Gln Thr
35 40 45

Ala Thr Ser Gln Gln Pro Ser Gly Gly Asn Val Val Pro Tyr Tyr Ser
50 55 60

Trp Phe Ser Gly Ile Thr Gln Phe Gln Lys Gly Lys Glu Phe Glu Phe
65 70 75 80

Ala Glu Gly Gln Gly Val Pro Ile Ala Pro Gly Val Pro Ala Thr Glu
85 90 95

Ala Lys Gly Tyr Trp Tyr Arg His Asn Arg Arg Ser Phe Lys Thr Ala
100 105 110

Asp Gly Asn Gln Arg Gln Leu Leu Pro Arg Trp Tyr Phe Tyr Tyr Leu
115 120 125

Gly Thr Gly Pro His Ala Lys Asp Gln Tyr Gly Thr Asp Ile Asp Gly
130 135 140

Val Tyr Trp Val Ala Ser Asn Gln Ala Asp Val Asn Thr Pro Ala Asp
145 150 155 160

Ile Leu Asp Arg Asp Pro Ser Ser Asp Glu Ala Ile Pro Thr Arg Phe
165 170 175

Pro Pro Gly Thr Val Leu Pro Gln Gly Tyr Tyr Ile Glu Gly Ser Gly
180 185 190

Arg Ser Ala Pro Asn Ser Arg Ser Thr Ser Arg Ala Ser Ser Arg Ala
195 200 205

Ser Ser Ala Gly Ser Arg Ser Arg Ala Asn Ser Gly Asn Arg Thr Pro
210 215 220

Thr Ser Gly Val Thr Pro Asp Met Ala Asp Gln Ile Ala Ser Leu Val
225 230 235 240

Leu Ala Lys Leu Gly Lys Asp Ala Ala Lys Pro Gln Gln Val Thr Lys
245 250 255
Gln Thr Ala Lys Glu Ile Arg Gln Lys Ile Leu Asn Lys Pro Arg Gln
260 265 270

Lys Arg Ser Pro Asn Lys Gln Cys Thr Val Gln Gln Cys Phe Gly Lys
275 280 285

Arg Gly Pro Asn Gln Asn Phe Gly Gly Gly Glu Met Leu Lys Leu Gly
290 295 300

Thr Ser Asp Pro Gln Phe Pro Ile Leu Ala Glu Leu Ala Pro Thr Ala
305 310 315 320

Gly Ala Phe Phe Phe Gly Ser Arg Leu Glu Leu Ala Lys Val Gln Asn
325 330 335

Leu Ser Gly Asn Leu Asp Glu Pro Gln Lys Asp Val Tyr Glu Leu Arg
340 345 350

Tyr Asn Gly Ala Ile Arg Phe Asp Ser Thr Leu Ser Gly Phe Glu Thr
355 360 365

Ile Met Lys Val Leu Asn Glu Asn Leu Asn Ala Tyr Gln Gln Gln Asp
370 375 380

Gly Thr Met Asn Met Ser Pro Lys Pro Gln Arg Gln Arg Gly Gln Lys
385 390 395 400

Asn Gly Gln Gly Glu Asn Asp Asn Ile Ser Val Ala Ala Pro Lys Ser
405 410 415

Arg Val Gln Gln Asn Lys Ile Arg Glu Leu Thr Ala Glu Asp Ile Ser
420 425 430

Leu Leu Lys Lys Met Asp Glu Pro Phe Thr Glu Asp Thr Ser Glu Ile
435 440 445



<210> 45 

<211> 409 

<212> PRT 

<213> Avian infectious bronchitis virus 

<400> 45 

Met Ala Ser Gly Lys Ala Ala Gly Lys Thr Asp Ala Pro Ala Pro Val
1 5 10 15

Ile Lys Leu Gly Gly Pro Lys Pro Pro Lys Val Gly Ser Ser Gly Asn
20 25 30

Ala Ser Trp Phe Gln Ala Leu Lys Ala Lys Lys Leu Asn Ala Pro Ala
35 40 45

Pro Lys Phe Glu Gly Ser Gly Val Pro Asp Asn Glu Asn Leu Lys Ile
50 55 60

Ser Gln Gln His Gly Tyr Trp Arg Arg Gln Ala Arg Tyr Lys Pro Gly
65 70 75 80

Lys Gly Gly Arg Lys Pro Val Pro Asp Ala Trp Tyr Phe Tyr Tyr Thr
85 90 95

Gly Thr Gly Pro Ala Ala Asp Leu Asn Trp Gly Asp Ser Gln Asp Gly
100 105 110

Ile Val Trp Val Ala Ala Lys Gly Ala Asp Val Lys Ser Arg Ser Asn
115 120 125

Gln Gly Thr Arg Asp Pro Asp Lys Phe Asp Gln Tyr Pro Leu Arg Phe
130 135 140

Ser Asp Gly Gly Pro Asp Gly Asn Phe Arg Trp Asp Phe Ile Pro Leu
145 150 155 160

Asn Arg Gly Arg Ser Gly Arg Ser Thr Ala Ala Ser Ser Ala Ala Ser
165 170 175

Ser Arg Ala Pro Ser Arg Glu Gly Ser Arg Gly Arg Leu Asn Gly Ala
180 185 190

Glu Asp Asp Leu Ile Ala Arg Ala Ala Lys Ile Ile Gln Asp Gln Gln
195 200 205

Lys Lys Gly Ser Arg Ile Thr Lys Ala Lys Ala Glu Glu Met Ile His
210 215 220

Arg Arg Tyr Cys Lys Arg Thr Val Pro Pro Gly Val Ser Ile Asp Lys
225 230 235 240

Val Phe Gly Pro Arg Thr Lys Gly Lys Glu Gly Asn Phe Gly Asp Asp
245 250 255

Lys Met Asn Glu Glu Gly Ile Lys Asp Gly Arg Val Thr Ala Met Leu
260 265 270
Asn Leu Val Pro Ser Ser His Ala Cys Leu Phe Gly Ser Gln Val Thr
275 280 285

Pro Lys Leu Gln Pro Asp Gly Leu His Leu Thr Phe Arg Phe Thr Thr
290 295 300

Val Val Ser Arg Asp Asp Pro Gln Phe Asp Asn Tyr Val Lys Ile Cys
305 310 315 320

Asp Glu Cys Val Asp Gly Val Gly Thr Arg Pro Lys Asp Glu Val Val
325 330 335

Arg Pro Lys Ser Arg Ser Ser Ser Arg Pro Ala Thr Arg Gly Thr Ser
340 345 350

Pro Ala Pro Lys Gln Gln Arg Pro Lys Lys Glu Lys Lys Pro Lys Lys
355 360 365

Gln Asp Asp Glu Val Asp Lys Ala Leu Thr Ser Asp Glu Glu Arg Asn
370 375 380

Asn Ala Gln Leu Glu Phe Asp Asp Glu Pro Lys Val Ile Asn Trp Gly
385 390 395 400

Asp Ser Ala Leu Gly Glu Asn Glu Leu
405



<210> 46

<211> 376

<212> PRT

<213> Feline coronavirus

<400> 46

Met Ala Thr Gln Gly Gln Arg Val Asn Trp Gly Asp Glu Pro Ser Lys
1 5 10 15

Arg Arg Gly Arg Ser Asn Ser Arg Gly Arg Lys Asn Asn Asp Ile Pro
20 25 30

Leu Ser Tyr Phe Asn Pro Ile Thr Leu Asp Gln Gly Ser Lys Phe Trp
35 40 45

Asn Leu Cys Pro Arg Asp Phe Val Pro Lys Gly Ile Gly Asn Lys Asp
50 55 60
Gln Gln Ile Gly Tyr Trp Asn Arg Gln Ala Arg Tyr Arg Ile Val Lys
65 70 75 80

Gly Gln Arg Val Glu Leu Pro Glu Arg Trp Phe Phe Tyr Phe Leu Gly
85 90 95

Thr Gly Pro His Ala Asp Ala Lys Phe Lys Ala Lys Ile Asp Gly Val
100 105 110

Phe Trp Val Ala Arg Asp Gly Ala Met Asn Lys Pro Thr Ser Leu Gly
115 120 125

Thr Arg Gly Thr Asn Asn Glu Ser Lys Pro Leu Lys Phe Asp Gly Lys
130 135 140

Ile Pro Pro Gln Phe Gln Leu Glu Val Asn Arg Ser Arg Asn Asn Ser
145 150 155 160

Arg Ser Gly Ser Gln Ser Arg Ser Val Ser Arg Asn Arg Ser Gln Ser
165 170 175

Arg Gly Arg Gln Gln Ser Asn Asn Gln Asn Thr Asn Val Glu Asp Thr
180 185 190

Ile Val Ala Val Leu Gln Lys Leu Gly Val Thr Asp Lys Gln Arg Ser
195 200 205

Arg Ser Lys Ser Gly Glu Arg Ser Gln Ser Lys Ser Arg Asp Thr Thr
210 215 220

Pro Lys Asn Ala Asn Lys His Thr Trp Lys Lys Thr Ala Gly Lys Gly
225 230 235 240

Asp Val Thr Asn Phe Tyr Gly Ala Arg Ser Ser Ser Ala Asn Phe Gly
245 250 255

Asp Ser Asp Leu Val Ala Asn Gly Asn Ala Ala Lys Cys Tyr Pro Gln
260 265 270

Ile Ala Glu Cys Val Pro Ser Val Ser Ser Ile Leu Phe Gly Ser Gln
275 280 285

Trp Ser Ala Glu Glu Ala Gly Asp Gln Val Lys Val Thr Leu Thr His
290 295 300

Asn Tyr Tyr Leu Pro Lys Asp Asp Ala Lys Thr Ser Gln Phe Leu Glu
305 310 315 320



Gln Ile Asp Ala Tyr Lys Arg Pro Ser Glu Val Ala Lys Asp Gln Arg

325 330 335



Gln Arg Lys Ser Arg Ser Lys Ser Ala Asp Lys Lys Pro Glu Glu Leu

340 345 350

Ser Val Thr Leu Glu Ala Tyr Thr Asp Val Phe Asp Asp Thr Gln Val
355 360 365



Glu Met Ile Asp Glu Val Thr Asn

370 375



<210> 47 

<211> 382
<212> PRT 

<213> porcine transmissible gastroenteritis virus 
<400> 47

Met Ala Asn Gln Gly Gln Arg Val Ser Trp Gly Asp Glu Ser Thr Lys
1 5 10 15



Thr Arg Gly Arg Ser Asn Ser Arg Gly Arg Lys Asn Asn Asn Ile Pro
20 25 30



Leu Ser Phe Phe Asn Pro Ile Thr Leu Gln Gln Gly Ser Lys Phe Trp
35 40 45 



Asn Leu Cys Pro Arg Asp Phe Val Pro Lys Gly Ile Gly Asn Arg Asp
50 55 60 



Gln Gln Ile Gly Tyr Trp Asn Arg Gln Thr Arg Tyr Arg Met Val Lys
65 70 75 80 



Gly Gln Arg Lys Glu Leu Pro Glu Arg Trp Phe Phe Tyr Tyr Leu Gly

85 90 95



Thr Gly Pro His Ala Asp Ala Lys Phe Lys Asp Lys Leu Asp Gly Val

100 105 110 



Val Trp Val Ala Lys Asp Gly Ala Met Asn Lys Pro Thr Thr Leu Gly 
115 120 125 



Ser Arg Gly Ala Asn Asn Glu Ser Lys Ala Leu Lys Phe Asp Gly Lys
130 135 140 
Val Pro Gly Glu Phe Gln Leu Glu Val Asn Gln Ser Arg Asp Asn Ser
145 150 155 160



Arg Leu Arg Ser Gln Ser Arg Ser Arg Ser Arg Asn Arg Ser Gln Ser

165 170 175

Arg Gly Arg Gln Gln Ser Asn Asn Lys Lys Asp Asp Ser Val Glu Gln

180 185 190



Ala Val Leu Ala Ala Leu Lys Lys Leu Gly Val Tyr Thr Glu Lys Gln
195 200 205 



Gln Gln Arg Ser Arg Ser Lys Ser Lys Glu Arg Ser Asn Ser Lys Ile
210 215 220 



Arg Asp Thr Thr Pro Lys Asn Glu Asn Lys His Thr Trp Lys Arg Thr
225 230 235 240



Ala Gly Lys Gly Asp Val Thr Arg Phe Tyr Gly Thr Arg Ser Asn Ser 

245 250 255 



Ala Asn Phe Gly Asp Ser Asp Leu Val Ala Asn Gly Ser Ser Ala Lys

260 265 270



His Tyr Pro Gln Leu Ala Glu Cys Val Pro Ser Val Ser Ser Ile Leu
275 280 285






Phe Gly Ser Tyr Trp Thr Ser Lys Glu Asp Gly Asp Gln Ile Glu Val
290 295 300



Thr Phe Thr His Lys Tyr His Leu Pro Lys Asp Asp Pro Lys Thr Gly 
305 310 315 320 



Gln Phe Leu Gln Gln Ile Asn Ala Tyr Ala Arg Pro Ser Glu Val Ala

325 330 335 



Lys Glu Gln Arg Lys Arg Lys Ser Arg Ser Lys Ser Ala Glu Arg Ser

340 345 350 



Glu Gln Glu Val Val Pro Asp Ala Leu Ile Glu Asn Tyr Thr Asp Val
355 360 365 



Phe Asp Asp Thr Gln Val Glu Met Ile Asp Glu Val Thr Asn
370 375 380 

<210> 48

<211> 389
<212> PRT

<213> Human coronavirus 229E

<400> 48

Met Ala Thr Val Lys Trp Ala Asp Ala Ser Glu Pro Gln Arg Gly Arg

1 5 10 15

Gln Gly Arg Ile Pro Tyr Ser Leu Tyr Ser Pro Leu Leu Val Asp Ser

20 25 30

Glu Gln Pro Trp Lys Val Ile Pro Arg Asn Leu Val Pro Ile Asn Lys
35 40 45

Lys Asp Lys Asn Lys Leu Ile Gly Tyr Trp Asn Val Gln Lys Arg Phe
50 55 60

Arg Thr Arg Lys Gly Lys Arg Val Asp Leu Ser Pro Lys Leu His Phe
65 70 75 80



Tyr Tyr Leu Gly Thr Gly Pro His Lys Asp Ala Lys Phe Arg Glu Arg 

85 90 95



Val q,iyy Gly Val Val Trp Val Ala Val Asp Gly Ala Lys Thr Glu Pro

100 105 110



Thr Gly Tyr Gly Val Arg Arg Lys Asn Ser Glu Pro Glu Ile Pro His
115 120 125 



Phe Asn Gln Lys Leu Pro Asn Gly Val Thr Val Val Glu Glu Pro Asp
130 135 140



Ser Arg Ala Pro Ser Arg Ser Gln Ser Arg Ser Gln Ser Arg Gly Arg
145 150 155 160



Gly Glu Ser Lys Pro Gln Ser Arg Asn Pro Ser Ser Asp Arg Asn His

165 170 175 



Asn Ser Gln Asp Asp Ile Met Lys Ala Val Ala Ala Ala Leu Lys Ser

180 185 190 



Leu Gly Phe Asp Lys Pro Gln Glu Lys Asp Lys Lys Ser Ala Lys Thr
195 200 205



Gly Thr Pro Lys Pro Ser Arg Asn Gln Ser Pro Ala Ser Ser Gln Thr
210 215 220 

Ser Ala Lys Ser Leu Ala Arg Ser Gln Ser Ser Glu Thr Lys Glu Gln
225 230 235 240



Lys His Glu Met Gln Lys Pro Arg Trp Lys Arg Gln Pro Asn Asp Asp

245 250 255



Val Thr Ser Asn Val Thr Gln Cys Phe Gly Pro Arg Asp Leu Asp His

260 265 270 



Asn Phe Gly Ser Ala Gly Val Val Ala Asn Gly Val Lys Ala Lys Gly
275 280 285



Tyr Pro Gln Phe Ala Glu Leu Val Pro Ser Thr Ala Ala Met Leu Phe
290 295 300



Asp Ser His Ile Val Ser Lys Glu Ser Gly Asn Thr Val Val Leu Thr
305 310 315 320



Phe Thr Thr Arg Val Thr Val Pro Lys Asp His Pro His Leu Gly Lys 

325 330 335



Phe Leu Glu Glu Leu Asn Ala Phe Thr Arg Glu Met Gln Gln His Pro

340 345 350 



Leu Leu Asn Pro Ser Ala Leu Glu Phe Asn Pro Ser Gln Thr Ser Pro
355 360 365



Ala Thr Ala Glu Pro Val Arg Asp Glu Val Ser Ile Glu Thr Asp Ile
370 375 380



Ile Asp Glu Val Asn
385 



<210> 49

<211> 448 

<212> PRT

<213> Human coronavirus 

<400> 49 

Met Ser Phe Thr Pro Gly Lys Gln Ser Ser Ser Arg Ala Ser Ser Gly
1 5 10 15



Asn Arg Ser Gly Asn Gly Ile Leu Lys Trp Ala Asp Gln Ser Asp Gln

20 25 30 

Val Arg Asn Val Gln Thr Arg Gly Arg Arg Ala Gln Pro Lys Gln Thr
35 40 45



Ala Thr Ser Gln Gln Pro Ser Gly Gly Asn Val Val Pro Tyr Tyr Ser
50 55 60



Trp Phe Ser Gly Ile Thr Gln Phe Gln Lys Gly Lys Glu Phe Glu Phe
65 70 75 80

Val Glu Gly Gln Gly Pro Pro Ile Ala Pro Gly Val Pro Ala Thr Glu

85 90 95



Ala Lys Gly Tyr Trp Tyr Arg His Asn Arg Gly Ser Phe Lys Thr Ala

100 105 110 




Asp Gly Asn Gln Arg Gln Leu Leu Pro Arg Trp Tyr Phe Tyr Tyr Leu

115 120 125



Gly Thr Gly Pro His Ala Lys Asp Gln Tyr Gly Thr Asp Ile Asp Gly
130 135 140 



Val Tyr Trp Val Ala Ser Asn Gln Ala Asp Val Asn Thr Pro Ala Asp
145 150 155 160



Ile Val Asp Arg Asp Pro Ser Ser Asp Glu Ala Ile Pro Thr Arg Phe

165 170 175



Pro Pro Gly Thr Val Leu Pro Gln Gly Tyr Tyr Ile Glu Gly Ser Gly

180 185 190



Arg Ser Ala Pro Asn Ser Arg Ser Thr Ser Arg Thr Ser Ser Arg Ala 
195 200 205



Ser Ser Ala Gly Ser Arg Ser Arg Ala Asn Ser Gly Asn Arg Thr Pro
210 215 220



Thr Ser Gly Val Thr Pro Asp Met Ala Asp Gln Ile Ala Ser Leu Val
225 230 235 240 



Leu Ala Lys Leu Gly Lys Asp Ala Thr Lys Pro Gln Gln Val Thr Lys

245 250 255 



His Thr Ala Lys Glu Val Arg Gln Lys Ile Leu Asn Lys Pro Arg Gln

260 265 270 



Lys Arg Ser Pro Asn Lys Gln Cys Thr Val Gln Gln Cys Phe Gly Lys
275 280 285

Arg Gly Pro Asn Gln Asn Phe Gly Gly Gly Glu Met Leu Lys Leu Gly
290 295 300

Thr Ser Asp Pro Gln Phe Pro Ile Leu Ala Glu Leu Ala Pro Thr Ala
305 310 315 320 



Gly Ala Phe Phe Phe Gly Ser Arg Leu Glu Leu Ala Lys Val Gln Asn

325 330 335



Leu Ser Gly Asn Pro Asp Glu Pro Gln Lys Asp Val Tyr Glu Leu Arg

340 345 350



Tyr Asn Gly Ala Ile Arg Phe Asp Ser Thr Leu Ser Gly Phe Glu Thr
355 360 365 



Ile Met Lys Val Leu Asn Glu Asn Leu Asn Ala Tyr Gln Gln Gln Asp
370 375 380 



Gly Met Met Asn Met Ser Pro Lys Pro Gln Arg Gln Arg Gly His Lys
385 390 395 400

Asn Gly Gln Gly Glu Asn Asp Asn Ile Ser Val Ala Val Pro Lys Ser

405 410 415



Arg Val Gln Gln Asn Lys Ser Arg Glu Leu Thr Ala Glu Asp Ile Ser

420 425 430



Leu Leu Lys Lys Met Asp Glu Pro Tyr Thr Glu Asp Thr Ser Glu Ile
435 440 445 



<210> 50

<211> 449

<212> PRT

<213> porcine hemagglutinating encephalomyelitis

<400> 50

Met Ser Phe Thr Pro Gly Lys Gln Ser Ser Ser Arg Ala Ser Ser Gly

1 5 10 15



Asn Arg Ser Gly Asn Gly Ile Leu Lys Trp Ala Asp Gln Ser Asp Gln

20 25 30 



Ser Arg Asn Val Gln Thr Arg Gly Arg Arg Val Gln Ser Lys Gln Thr
35 40 45 
Ala Thr Ser Gln Gln Pro Ser Gly Gly Thr Val Val Pro Tyr Tyr Ser
50 55 60 



Trp Phe Ser Gly Ile Thr Gln Phe Gln Lys Gly Lys Glu Phe Glu Phe
65 70 75 80



Ala Glu Gly Gln Gly Val Pro Ile Ala Pro Gly Val Pro Ser Thr Glu

85 90 95



Ala Lys Gly Tyr Trp Tyr Arg His Asn Arg Arg Ser Phe Lys Thr Ala

100 105 110



Asp Gly Asn Gln Arg Gln Leu Leu Pro Arg Trp Tyr Phe Tyr Tyr Leu
115 120 125



Gly Thr Gly Pro His Ala Lys Asp Gln Tyr Gly Thr Asp Ile Asp Gly
130 135 140 



Val Phe Trp Val Ala Ser Asn Gln Ala Asp Ile Asn Thr Pro Ala Asp
145 150 155 160



Ile Val Asp Arg Asp Pro Ser Ser Asp Glu Ala Ile Pro Thr Arg Phe

165 170 175



Pro Pro Gly Thr Val Leu Pro Gln Gly Tyr Tyr Ile Glu Gly Ser Gly

180 185 190



Arg Ser Ala Pro Asn Ser Arg Ser Thr Ser Arg Ala Pro Asn Arg Ala
195 200 205



Pro Ser Ala Gly Ser Arg Ser Arg Ala Asn Ser Gly Asn Arg Thr Ser 
210 215 220 



Thr Pro Gly Val Thr Pro Asp Met Ala Asp Gln Ile Ala Ser Leu Val
225 230 235 240



Leu Ala Lys Leu Gly Lys Asp Ala Thr Lys Pro Gln Gln Val Thr Lys

245 250 255 



Gln Thr Ala Lys Glu Val Arg Gln Lys Ile Leu Asn Lys Pro Arg Gln

260 265 270 



Lys Arg Ser Pro Asn Lys Gln Cys Thr Val Gln Gln Cys Phe Gly Lys
275 280 285 

Arg Gly Pro Asn Gln Asn Phe Gly Gly Gly Glu Met Leu Lys Leu Gly
290 295 300



Thr Ser Asp Pro Gln Phe Pro Ile Leu Ala Glu Leu Ala Pro Thr Ala

305 310 315 320 


Gly Ala Phe Phe Phe Gly Ser Arg Leu Glu Leu Ala Lys Val Gln Asn

325 330 335



Leu Ser Gly Asn Pro Asp Glu Pro Gln Lys Asp Val Tyr Glu Leu Arg

340 345 350 



Tyr Asn Gly Ala Ile Arg Phe Asp Ser Thr Leu Ser Gly Phe Glu Thr
355 360 365 



Ile Met Lys Val Leu Asn Gln Asn Leu Asn Ala Tyr Gln His Gln Glu
370 375 380



Asp Gly Met Met Asn Ile Ser Pro Lys Pro Gln Arg Gln Arg Gly Gln
385 390 395 400 



Lys Asn Gly Gln Val Glu Asn Asp Asn Val Ser Val Ala Ala Pro Lys

405 410 415 



Ser Arg Val Gln Gln Asn Lys Ser Arg Glu Leu Thr Ala Glu Asp Ile

420 425 430



Ser Leu Leu Lys Lys Met Asp Glu Pro Tyr Thr Glu Asp Thr Ser Glu
435 440 445 



Ile



<210> 51 

<211> 409 

<212> PRT 

<213> turkey coronavirus 

<400> 51

Met Ala Ser Gly Lys Ala Thr Gly Lys Thr Asp Ala Pro Ala Pro Ile

1 5 10 15



Ile Lys Leu Gly Gly Pro Lys Pro Pro Lys Val Gly Ser Ser Gly Asn

20 25 30 



Ala Ser Trp Phe Gln Ser Ile Lys Ala Lys Lys Leu Asn Ser Pro Gln
35 40 45
Pro Lys Phe Glu Gly Ser Gly Val Pro Asp Asn Glu Asn Ile Lys Thr
50 55 60



Ser Gln Gln His Gly Tyr Trp Arg Arg Gln Ala Arg Phe Lys Pro Gly
65 70 75 80

Lys Gly Gly Arg Lys Pro Val Pro Asp Ala Trp Tyr Phe Tyr Tyr Thr 

85 90 95



Gly Thr Gly Pro Ala Ala Asp Leu Asn Trp Gly Asp Thr Gln Asp Gly

100 105 110 



Ile Val Trp Val Ala Ala Lys Gly Ala Asp Val Lys Ser Arg Ser Asn
115 120 125



Gln Gly Thr Arg Asp Pro Asp Lys Phe Asp Gln Tyr Pro Leu Arg Phe
130 135 140 



Ser Asp Gly Gly Pro Asp Ser Asn Phe Arg Trp Asp Phe Ile Pro Leu
145 150 155 160 



His Arg Gly Arg Ser Gly Arg Ser Thr Ala Ala Ser Ser Ala Ala Ser

165 170 175 



Ser Arg Ala Pro Ser Arg Asp Gly Ser Arg Gly Arg Arg Ser Gly Ser 

180 185 190 



Glu Asp Asp Leu Ile Ala Arg Ala Ala Lys Ile Ile Gln Asp Gln Gln
195 200 205



Lys Lys Gly Ser Arg Ile Thr Lys Ala Lys Ala Asp Glu Met Ala His
210 215 220



Arg Arg Tyr Cys Lys Arg Thr Val Pro Pro Gly Tyr Lys Val Asp Gln
225 230 235 240 



Val Phe Gly Pro Arg Thr Lys Gly Lys Glu Gly Asn Phe Gly Asp Asp

245 250 255 



Lys Met Asn Glu Glu Gly Ile Lys Asp Gly Arg Val Thr Ala Met Leu

260 265 270 



Asn Leu Val Pro Ser Ser His Ala Cys Leu Phe Gly Ser Arg Val Thr 
275 280 285 



Pro Lys Leu Gln Pro Asp Gly Leu His Leu Arg Phe Glu Phe Thr Thr
290 295 300



Val Val Pro Arg Asp Asp Pro Gln Phe Asp Asn Tyr Val Thr Ile Cys

305 310 315 320 

Asp Gln Cys Val Asp Gly Ile Gly Thr Arg Pro Lys Asp Asn Glu Pro

325 330 335



Arg Pro Lys Ser Arg Pro Ser Ser Arg Pro Ala Thr Arg Gly Asn Ser

340 345 350 



Pro Ala Pro Arg Gln Gln Arg Pro Lys Lys Glu Lys Lys Pro Lys Lys
355 360 365 



Gln Asp Asp Glu Val Asp Lys Ala Leu Thr Ser Asp Glu Glu Arg Asn
370 375 380



Asn Ala Gln Leu Glu Phe Asp Asp Glu Pro Lys Val Ile Asn Trp Gly
385 390 395 400



Asp Ser Ala Leu Gly Glu Asn His Leu

405 



<210> 52 
<211> 1173
<212> PRT 

<213> Human coronavirus 229E
<400> 52

Met Phe Val Leu Leu Val Ala Tyr Ala Leu Leu His Ile Ala Gly Cys
1 5 10 15



Gln Thr Thr Asn Gly Leu Asn Thr Ser Tyr Ser Val Cys Asn Gly Cys

20 25 30 



Val Gly Tyr Ser Glu Asn Val Phe Ala Val Glu Ser Gly Gly Tyr Ile
35 40 45



Pro Ser Asp Phe Ala Phe Asn Asn Trp Phe Leu Leu Thr Asn Thr Ser 
50 55 60 



Ser Val Val Asp Gly Val Val Arg Ser Phe Gln Pro Leu Leu Leu Asn
65 70 75 80 



Cys Leu Trp Ser Val Ser Gly Leu Arg Phe Thr Thr Gly Phe Val Tyr 






WO 2004/096842 PCT/CA2004/000626

85 90 95 



Phe Asn Gly Thr Gly Arg Gly Asp Cys Lys Gly Phe Ser Ser Asp Val 

100 105 110



Leu Ser Asp Val Ile Arg Tyr Asn Leu Asn Phe Glu Glu Asn Leu Arg
115 120 125

Arg Gly Thr Ile Leu Phe Lys Thr Ser Tyr Gly Val Val Val Phe Tyr
130 135 140 



Cys Thr Asn Asn Thr Leu Val Ser Gly Asp Ala His Ile Pro Phe Gly
145 150 155 160



Thr Val Leu Gly Asn Phe Tyr Cys Phe Val Asn Thr Thr Ile Gly Thr

165 170 175



Glu Thr Thr Ser Ala Phe Val Gly Ala Leu Pro Lys Thr Val Arg Glu 

180 185 190 



Phe Val Ile Ser Arg Thr Gly His Phe Tyr Ile Asn Gly Tyr Arg Tyr
195 200 205 


Phe Thr Leu Gly Asn Val Glu Ala Val Asn Phe Asn Val Thr Thr Ala
210 215 220



Glu Thr Thr Asp Phe Phe Thr Val Ala Leu Ala Ser Tyr Ala Asp Val 
225 230 235 240



Leu Val Asn Val Ser Gln Thr Ser Ile Ala Asn Ile Ile Tyr Cys Asn

245 250 255 



Ser Val Ile Asn Arg Leu Arg Cys Asp Gln Leu Ser Phe Tyr Val Pro

260 265 270



Asp Gly Phe Tyr Ser Thr Ser Pro Ile Gln Ser Val Glu Leu Pro Val
275 280 285 



Ser Ile Val Ser Leu Pro Val Tyr His Lys His Met Phe Ile Val Leu
290 295 300 



Tyr Val Asp Phe Lys Pro Gln Ser Gly Gly Gly Lys Cys Phe Asn Cys
305 310 315 320 



Tyr Pro Ala Gly Val Asn Ile Thr Leu Ala Asn Phe Asn Glu Thr Lys

325 330 335 
Gly Pro Leu Cys Val Asp Thr Ser His Phe Thr Thr Lys Tyr Val Ala
340 345 350



Val Tyr Ala Asn Val Gly Arg Trp Ser Ala Ser Ile Asn Thr Gly Asn
355 360 365



Cys Pro Phe Ser Phe Gly Lys Val Asn Asn Phe Val Lys Phe Gly Ser 
370 375 380



Val Cys Phe Ser Leu Lys Asp Ile Pro Gly Gly Cys Ala Met Pro Ile
385 390 395 400



Val Ala Asn Trp Ala Tyr Ser Lys Tyr Tyr Thr Ile Gly Thr Leu Tyr

405 410 415



Val Ser Trp Ser Asp Gly Asp Gly Ile Thr Gly Val Pro Gln Pro Val

420 425 430



Glu Gly Val Ser Ser Phe Met Asn Val Thr Leu Asp Lys Cys Thr Lys
435 440 445



Tyr Asn Ile Tyr Asp Val Ser Gly Val Gly Val Ile Arg Val Ser Asn
450 455 460



Asp Thr Phe Leu Asn Gly Ile Thr Tyr Thr Ser Thr Ser Gly Asn Leu
465 470 475 480



Leu Gly Phe Lys Asp Val Thr Lys Gly Thr Ile Tyr Ser Ile Thr Pro

485 490 495



Cys Asn Pro Pro Asp Gln Leu Val Val Tyr Gln Gln Ala Val Val Gly

500 505 510 



Ala Met Leu Ser Glu Asn Phe Thr Ser Tyr Gly Phe Ser Asn Val Val 
515 520 525



Glu Leu Pro Lys Phe Phe Tyr Ala Ser Asn Gly Thr Tyr Asn Cys Thr
530 535 540 



Asp Ala Val Leu Thr Tyr Ser Ser Phe Gly Val Cys Ala Asp Gly Ser 
545 550 555 560 






Ile Ile Ala Val Gln Pro Arg Asn Val Ser Tyr Asp Ser Val Ser Ala

565 570 575 
Ile Val Thr Ala Asn Leu Ser Ile Pro Ser Asi> Trp Thr Ile Ser Val

580 585 590



Gln Val Glu Tyr Leu Gln Ile Thr Ser Thr Pro Ile Val Val Asp Cys
595 600 605

Ser Thr Tyr Val Cys Asn Gly Asn Val Arg Cys Val Glu Leu Leu Lys 

610 615 620 



Gln Tyr Thr Ser Ala Cys Lys Thr Ile Glu Asp Ala Leu Arg Asn Ser
625 630 635 640



Ala Arg Leu Glu Ser Ala Asp Val Ser Glu Met Leu Thr Phe Asp Lys

645 650 655



Lys Ala Phe Thr Leu Ala Asn Val Ser Ser Phe Gly Asp Tyr Asn Leu

660 665 670



Ser Ser Val Ile Pro Ser Leu Pro Thr Ser Gly Ser Arg Val Ala Gly
675 680 685



Arg Ser Ala Ile Glu Asp Ile Leu Phe Ser Lys Ile Val Thr Ser Gly
690 695 700



Leu Gly Thr Val Asp Ala Asp Tyr Lys Asn Cys Thr Lys Gly Leu Ser
705 710 715 720



Ile Ala Asp Leu Ala Cys Ala Gln Tyr Tyr Asn Gly Ile Met Val Leu

725 730 735



Pro Gly Val Ala Asp Ala Glu Arg Met Ala Met Tyr Thr Gly Ser Leu

740 745 750



Ile Gly Gly Ile Ala Leu Gly Gly Leu Thr Ser Ala Val Ser Ile Pro
755 760 765 



Phe Ser Leu Ala Ile Gln Ala Arg Leu Asn Tyr Val Ala Leu Gln Thr
770 775 780 



Asp Val Leu Gln Glu Asn Gln Lys Ile Leu Ala Ala Ser Phe Asn Lys
785 790 795 800 



Ala Met Thr Asn Ile Val Asp Ala Phe Thr Gly Val Asn Asp Ala Ile

805 810 815 

Thr Gln Thr Ser Gln Ala Leu Gln Thr Val Ala Thr Ala Leu Asn Lys

820 825 830



Ile Gln Asp Val Val Asn Gln Gln Gly Asn Ser Leu Asn His Leu Thr
835 840 845

Ser Gln Leu Arg Gln Asn Phe Gln Ala Ile Ser Ser Ser Ile Gln Ala
850 855 860



Ile Tyr Asp Arg Leu Asp Thr Ile Gln Ala Asp Gln Gln Val Asp Arg
865 870 875 880



Leu Ile Thr Gly Arg Leu Ala Ala Leu Asn Val Phe Val Ser His Thr

885 890 895 



Leu Thr Lys Tyr Thr Glu Val Arg Ala Ser Arg Gln Leu Ala Gln Gln

900 905 910



Lys Val Asn Glu Cys Val Lys Ser Gln Ser Lys Arg Tyr Gly Phe Cys
915 920 925 



Gly Asn Gly Thr His Ile Phe Ser Ile Val Asn Ala Ala Pro Glu Gly
930 935 940

Leu Val Phe Leu His Thr Val Leu Leu Pro Thr Gln Tyr Lys Asp Val
945 950 955 960 



Glu Ala Trp Ser Gly Leu Cys Val Asp Gly Thr Asn Gly Tyr Val Leu 

965 970 975



Arg Gln Pro Asn Leu Ala Leu Tyr Lys Glu Gly Asn Tyr Tyr Arg Ile

980 985 990 



Thr Ser Arg Ile Met Phe Glu Pro Arg Ile Pro Thr Met Ala Asp Phe
995 1000 1005 






Val Gln Ile Glu Asn Cys Asn Val Thr Phe Val Asn Ile Ser Arg
1010 1015 1020 



Ser Glu Leu Gln Thr Ile Val Pro Glu Tyr Ile Asp Val Asn Lys
1025 1030 1035 



Thr Leu Gln Glu Leu Ser Tyr Lys Leu Pro Asn Tyr Thr Val Pro
1040 1045 1050 



Asp Leu Val Val Glu Gln Tyr Asn Gln Thr Ile Leu Asn Leu Thr
1055 1060 1065



Ser Glu Ile Ser Thr Leu Glu Asn Lys Ser Ala Glu Leu Asn Tyr
1070 1075 1080



Thr Val Gln Lys Leu Gln Thr Leu Ile Asp Asn Ile Asn Ser Thr
1085 1090 1095



Leu Val Asp Leu Lys Trp Leu Asn Arg Val Glu Thr Tyr Ile Lys
1100 1105 1110



Trp Pro Trp Trp Val Trp Leu Cys Ile Ser Val Val Leu Ile Phe
1115 1120 1125 



Val Val Ser Met Leu Leu Leu Cys Cys Cys Ser Thr Gly Cys Cys
1130 1135 1140



Gly Phe Phe Ser Cys Phe Ala Ser Ser Ile Arg Gly Cys Cys Glu
1145 1150 1155



Ser Thr Lys Leu Pro Tyr Tyr Asp Val Glu Lys Ile His Ile Gln
1160 1165 1170 



<210> 53

<211> 1164 

<212> PRT 

<213> Avian infectious bronchitis virus

<400> 53

Met Leu Gly Lys Ser Leu Phe Leu Val Thr Ile Leu Cys Ala Leu Cys
1 5 10 15



Ser Ala Asn Leu Phe Asp Pro Ala Asn Tyr Val Tyr Tyr Tyr Gln Ser

20 25 30 



Ala Phe Arg Pro Ser Asn Gly Trp His Leu Gln Gly Gly Ala Tyr Ala
35 40 45



Val Val Asn Ser Ser Asn Tyr Ala Asn Asn Ala Gly Ser Ala Ser Glu 

50 55 60



Cys Thr Val Gly Val Ile Lys Asp Val Tyr Asn Gln Ser Ala Ala Ser
65 70 75 80 



Ile Ala Met Thr Ala Pro Leu Gln Gly Met Ala Trp Ser Lys Ser Gln

85 90 95 
Phe Cys Ser Ala His Cys Afep Phe Ser Glu Ile Thr Val Phe Val Thr

100 105 110



His Cys Tyr Ser Ser Gly Ser Gly Ser Cys Pro Ile Thr Gly Met Ile

115 120 125 


Ala Arg Gly His Ile Arg Ile Ser Ala Met Lys Asn Gly Ser Leu Phe
130 135 140 



Tyr Asn Leu Thr Val Ser Val Ser Lys Tyr Pro Asn Phe Lys Ser Phe 
145 150 155 160



Gln Cys Val Asn Asn Phe Thr Ser Val Tyr Leu Asn Gly Asp Leu Val

165 170 175 



Phe Thr Ser Asq Lys Thr Thr Asp Val Thr Ser Ala Gly Val Tyr Phe

180 185 190 



Lys Ala Gly Gly Pro Val Asn Tyr Ser Ile Met Lys Glu Phe Lys Val
195 200 205 



Leu Ala Tyr Phe Val Asn Gly Thr Ala Gln Asp Val Ile Leu Cys Asp
210 215 220



Asn Ser Pro Lys Gly Leu Leu Ala Cys Gln Tyr Asn Thr Gly Asn Phe
225 230 235 240



Ser Asp Gly Phe Tyr Pro Phe Thr Asn Ser Thr Leu Val Arg Glu Lys

245 250 255



Phe Ile Val Tyr Arg Glu Ser Ser Val Asn Thr Thr Leu Ala Leu Thr

260 265 270 



Asn Phe Thr Phe Thr Asn Val Ser Asn Ala Gln Pro Asn Ser Gly Gly
275 280 285



Val His Thr Phe His Leu Tyr Gln Thr Gln Thr Ala Gln Ser Gly Tyr
290 295 300 



Tyr Asn Phe Asn Leu Ser Phe Leu Ser Gln Phe Val Tyr Lys Ala Ser
305 310 315 320 



Asp Tyr Met Tyr Gly Ser Tyr His Pro Ile Cys Ala Phe Arg Pro Glu

325 330 335 
Thr Ile Asn Ser Gly Leu Trp Phe Asn Ser Leu Ser Val Ser Leu Thr

340 345 350



Tyr Gly Pro Leu Gln Gly Gly Tyr Lys Gln Ser Val Phe Ser Gly Lys
355 360 365 



Ala Thr Cys Cys Tyr Ala Tyr Ser Tyr Asn Gly Pro Arg Ala Cys Lys
370 375 380 



Gly Val Tyr Ser Gly Glu Leu Ser Arg Asp Phe Glu Cys Gly Leu Leu
385 390 395 400



Val Tyr Val Thr Lys Ser Asp Gly Ser Arg Ile Gln Thr Arg Thr Glu

405 410 415 



Pro Leu Val Leu Thr Gln His Asn Tyr Asn Asn Ile Thr Leu Asp Lys

420 425 430



Cys Val Ala Tyr Asn Ile Tyr Gly Arg Val Gly Gln Gly Phe Ile Thr
435 440 445



Asn Val Thr Asp Ser Val Ala Asn Phe Ser Tyr Leu Ala Asp Gly Gly
450 455 460






Leu Ala Ile Leu Asp Thr Ser Gly Ala Ile Asp Val Phe Val Val Gln
465 470 475 480 



Gly Ser Tyr Gly Leu Asn Tyr Tyr Lys Val Asn Pro Cys Glu Asp Val 

485 490 495 



Asn Gln Gln Phe Val Val Ser Gly Gly Asn Ile Val Gly Ile Leu Thr

500 505 510 



Ser Arg Asn Glu Thr Gly Ser Glu Gln Val Glu Asn Gln Phe Tyr Val
515 520 525 



Lys Leu Thr Asn Ser Ser His Arg Arg Arg Arg Ser Ile Gly Gln Asn
530 535 540 



Val Thr Ser Cys Pro Tyr Val Ser Tyr Gly Arg Phe Cys Ile Glu Pro
545 550 555 560



Asp Gly Ser Leu Lys Met Ile Val Pro Glu Glu Leu Lys Gln Phe Val

565 570 575 



Ala Pro Leu Leu Asn Ile Thr Glu Ser Val Leu Ile Pro Asn Ser Phe

580 585 590



Asn Leu Thr Val Thr Asp Glu Tyr Ile Gln Thr Arg Met Asp Lys Val
595 600 605



Gln Ile Asn Cys Leu Gln Tyr Val Cys Gly Asn Ser Leu Glu Cys Arg
610 615 620 



Lys Leu Phe Gln Gln Tyr Gly Pro Val Cys Asp Asn Ile Leu Ser Val
625 630 635 640



Val Asn Ser Val Ser Gln Lys Glu Asp Met Glu Leu Leu Ser Phe Tyr

645 650 655 



Ser Ser Thr Lys Pro Lys Gly Tyr Asp Thr Pro Val Leu Ser Asn Val

660 665 670 



Ser Thr Gly Glu Phe Asn Ile Ser Leu Leu Leu Thr Pro Pro Ser Ser
675 680 685



Pro Ser Gly Arg Ser Phe Val Glu Asp Leu Leu Phe Thr Ser Val Glu 
690 695 700

Thr Val Gly Leu Pro Thr Asp Ala Glu Tyr Lys Lys Cys Thr Ala Gly
705 710 715 720



Pro Leu Gly Thr Leu Lys Asp Leu Ile Cys Ala Arg Glu Tyr Asn Gly

725 730 735



Leu Leu Val Leu Pro Pro Ile Ile Thr Ala Asp Met Gln Thr Met Tyr

740 745 750 



Thr Ala Ser Leu Val Gly Ala Met Ala Phe Gly Gly Ile Thr Ser Ala
755 760 765 



Ala Ala Ile Pro Phe Ala Thr Gln Ile Gln Ala Arg Ile Asn His Leu
770 775 780



Gly Ile Ala Gln Ser Leu Leu Met Lys Asn Gln Glu Lys Ile Ala Ala
785 790 795 800 



Ser Phe Asn Lys Ala Ile Gly His Met Gln Glu Gly Phe Arg Ser Thr

805 810 815 



Ser Leu Ala Leu Gln Gln Val Gln Asp Val Val Asn Lys Gln Ser Ala

820 825 830 
Ile Leu Thr Glu Thr Met Asn Ser Leu Asn Lys Asn Phe Gly Ala Ile
835 840 845



Ser Ser Val Ile Gln Asp Ile Tyr Ala Gln Leu Asp Ala Ile Gln Ala
850 855 860

Asp Ala Gln Val Asp Arg Leu Ile Thr Gly Arg Leu Ser Ser Leu Ser
865 870 875 880



Val Leu Ala Ser Ala Lys Gln Ser Glu Tyr Ile Arg Val Ser Gln Gln

885 890 895



Arg Glu Leu Ala Thr Gln Lys Ile Asn Glu Cys Val Lys Ser Gln Ser

900 905 910



Asn Arg Tyr Gly Phe Cys Gly Ser Gly Arg His Val Leu Ser Ile Pro
915 920 925 



Gln Asn Ala Pro Asn Gly Ile Val Phe Ile His Phe Thr Tyr Thr Pro
930 935 940



Glu Thr Phe Val Asn Val Thr Ala Ile Val Gly Phe Cys Val Asn Pro

945 950 955 960



Leu Asn Ala Ser Gln Tyr Ala Ile Val Pro Ala Asn Gly Arg Gly Ile

965 970 975



Phe Ile Gln Val Asn Gly Thr Tyr Tyr Ile Thr Ser Arg Asp Met Tyr

980 985 990 



Met Pro Arg Asp Ile Thr Ala Gly Asp Ile Val Thr Leu Thr Ser Cys
995 1000 1005



Gln Ala Asn Tyr Val Asn Val Asn Lys Thr Val Ile Thr Thr Phe
1010 1015 1020 



Val Glu Asp Asp Asp Phe Asn Phe Asp Asp Glu Leu Ser Lys Trp 
1025 1030 1035 



Trp Asn Asp Thr Lys His Gly Leu Pro Asp Phe Asp Asp Phe Asn 
1040 1045 1050 



Tyr Thr Val Pro Ile Leu Asn Ile Ser Gly Glu Ile Asp Asn Ile
1055 1060 1065 
Gln Gly Val Ile Gln Gly Leu Asn Asp Ser Leu Ile Asn Leu Glu
1070 1075 1080



Glu Leu Ser Ile Ile Lys Thr Tyr Ile Lys Trp Pro Trp Tyr Val
1085 1090 1095

Trp Leu Ala Ile Gly Phe Ala Ile Ile Ile Phe Ile Leu Ile Leu
1100 1105 1110 



Gly Trp Val Phe Phe Met Thr Gly Cys Cys Gly Cys Cys Cys Gly 
1115 1120 1125 



Cys Phe Gly Ile Ile Pro Leu Ile Ser Lys Cys Gly Lys Lys Ser
1130 1135 1140 



Ser Tyr Tyr Thr Thr Phe Asp Asn Asp Val Val Thr Glu Gln Tyr
1145 1150 1155



Arg Pro Lys Lys Ser Val 
1160



<210> 54 

<211> 1363 

<212> PRT

<213> Bovine coronavirus

<400> 54

Met Phe Leu Ile Leu Leu Ile Ser Leu Pro Met Ala Phe Ala Val Ile
1 5 10 15



Gly Asp Leu Lys Cys Thr Thr Val Ser Ile Asn Asp Val Asp Thr Gly

20 25 30



Ala Pro Ser Ile Ser Thr Asp Ile Val Asp Val Thr Asn Gly Leu Gly
35 40 45



Thr Tyr Tyr Val Leu Asp Arg Val Tyr Leu Asn Thr Thr Leu Leu Leu 
50 55 60



Asn Gly Tyr Tyr Pro Thr Ser Gly Ser Thr Tyr Arg Asn Met Ala Leu 
65 70 75 80 



Lys Gly Thr Leu Leu Leu Ser Arg Leu Trp Phe Lys Pro Pro Phe Leu 

85 90 95 



Ser Asp Phe Ile Asn Gly Ile Phe Ala Lys Val Lys Asn Thr Lys Val

100 105 110



Ile Lys Lys Gly Val Met Tyr Ser Glu Phe Pro Ala Ile Thr Ile Gly
115 120 125



Ser Thr Phe Val Asn Thr Ser Tyr Ser Val Val Val Gln Pro His Thr
130 135 140 



Thr Asn Leu Asp Asn Lys Leu Gln Gly Leu Leu Glu Ile Ser Val Cys
145 150 155 160 



Gln Tyr Thr Met Cys Glu Tyr Pro His Thr Ile Cys His Pro Lys Leu

165 170 175



Gly Asn Lys Arg Val Glu Leu Trp His Trp Asp Thr Gly Val Val Ser

180 185 190 






Cys Leu Tyr Lys Arg Asn Phe Thr Tyr Asp Val Asn Ala Asp Tyr Leu 
195 200 205 



Tyr Phe His Phe Tyr Gln Glu Gly Gly Thr Phe Tyr Ala Tyr Phe Thr
210 215 220 


Asp Thr Gly Val Val Thr Lys Phe Leu Phe Asn Val Tyr Leu Gly Thr
225 230 235 240



Val Leu Ser His Tyr Tyr Val Leu Pro Leu Thr Cys Ser Ser Ala Met 

245 250 255



Thr Leu Glu Tyr Trp Val Thr Pro Leu Thr Ser Lys Gln Tyr Leu Leu

260 265 270 



Ala Phe Asn Gln Asp Gly Val Ile Phe Asn Ala Val Asp Cys Lys Ser
275 280 285 



Asp Phe Met Ser Glu Ile Lys Cys Lys Thr Leu Ser Ile Ala Pro Ser
290 295 300 



Thr Gly Val Tyr Glu Leu Asn Gly Tyr Thr Val Gln Pro Ile Ala Asp
305 310 315 320 



Val Tyr Arg Arg Ile Pro Asn Leu Pro Asp Cys Asn Ile Glu Ala Trp

325 330 335 



Leu Asn Asp Lys Ser Val Pro Ser Pro Leu Asn Trp Glu Arg Lys Thr 

340 345 350 
Phe Ser Asn Cys Asn Phe Asn Met Ser Ser Leu Met Ser Phe Ile Gln
355 360 365



Ala Asp Ser Phe Thr Cys Asn Asn Ile Asp Ala Ala Lys Ile Tyr Gly
370 375 380



Met Cys Phe Ser Ser Ile Thr Ile Asp Lys Phe Ala Ile Pro Asn Gly
385 390 395 400



Arg Lys Val Asp Leu Gln Leu Gly Asn Leu Gly Tyr Leu Gln Ser Phe

405 410 415



Asn Tyr Arg Ile Asp Thr Thr Ala Thr Ser Cys Gln Leu Tyr Tyr Asn

420 425 430 



Leu Pro Ala Ala Asn Val Ser Val Ser Arg Phe Asn Pro Ser Thr Trp
435 440 445



Asn Arg Arg Phe Gly Phe Thr Glu Gln Phe Val Phe Lys Pro Gln Pro
450 455 460



Val Gly Val Phe Thr His His Asp Val Val Tyr Ala Gln His Cys Phe
465 470 475 480



Lys Ala Pro Lys Asn Phe Cys Pro Cys Lys Leu Asp Gly Ser Leu Cys 

485 490 495



Val Gly Asn Gly Pro Gly Ile Asp Ala Gly Tyr Lys Asn Ser Gly Ile

500 505 510 



Gly Thr Cys Pro Ala Gly Thr Asn Tyr Leu Thr Cys His Asn Ala Ala 
515 520 525 



Gln Cys Asp Cys Leu Cys Thr Pro Asp Pro Ile Thr Ser Lys Ser Thr
530 535 540



Gly Pro Tyr Lys Cys Pro Gln Thr Lys Tyr Leu Val Gly Ile Gly Glu
545 550 555 560 



His Cys Ser Gly Leu Ala Ile Lys Ser Asp Tyr Cys Gly Gly Asn Pro

565 570 575 



Cys Thr Cys Gln Pro Gln Ala Phe Leu Gly Trp Ser Val Asp Ser Cys

580 585 590
Leu Gln Gly Asp Arg Cys Asn Ile Phe Ala Asn Phe Ile Phe His Asp
595 600 605



Val Asn Ser Gly Thr Thr Cys Ser Thr Asp Leu Gln Lys Ser Asn Thr
610 615 620

Asp Ile Ile Leu Gly Val Cys Val Asn Tyr Asp Leu Tyr Gly Ile Thr
625 630 635 640



Gly Gln Gly Ile Phe Val Glu Val Asn Ala Thr Tyr Tyr Asn Ser Trp

645 650 655



Gln Asn Leu Leu Tyr Asp Ser Asn Gly Asn Leu Tyr Gly Phe Arg Asp

660 665 670 



Tyr Leu Thr Asn Arg Thr Phe Met Ile Arg Ser Cys Tyr Ser Gly Arg
675 680 685



Val Ser Ala Ala Phe His Ala Asn Ser Ser Glu Pro Ala Leu Leu Phe 
690 695 700 



Arg Asn Ile Lys Cys Asn Tyr Val Phe Asn Asn Thr Leu Ser Arg Gln
705 710 715 720



Leu Gln Pro Ile Asn Tyr Phe Asp Ser Tyr Leu Gly Cys Val Val Asn

725 730 735 



Ala Asp Asn Ser Thr Ser Ser Val Val Gln Thr Cys Asp Leu Thr Val

740 745 750 



Gly Ser Gly Tyr Cys Val Asp Tyr Ser Thr Lys Arg Arg Ser Arg Arg
755 760 765 



Ala Ile Thr Thr Gly Tyr Arg Phe Thr Asn Phe Glu Pro Phe Thr Val
770 775 780 



Asn Ser Val Asn Asp Ser Leu Glu Pro Val Gly Gly Leu Tyr Glu Ile
785 790 795 800 



Gln Ile Pro Ser Glu Phe Thr Ile Gly Asn Met Glu Glu Phe Ile Gln

805 810 815



Thr Ser Ser Pro Lys Val Thr Ile Asp Cys Ser Ala Phe Val Cys Gly

820 825 830 
Asp Tyr Ala Ala Cys Lys Ser Gln Leu Val Glu Tyr Gly Ser Phe Cys
835 840 845



Asp Asn Ile Asn Ala Ile Leu Thr Glu Val Asn Glu Leu Leu Asp Thr
850 855 860 






Thr Gln Leu Gln Val Ala Asn Ser Leu Met Asn Gly Val Thr Leu Ser
865 870 875 880



Thr Lys Leu Lys Asp Gly Val Asn Phe Asn Val Asp Asp Ile Asn Phe

885 890 895



Ser Pro Val Leu Gly Cys Leu Gly Ser Ala Cys Asn Lys Val Ser Ser 

900 905 910 



Arg Ser Ala Ile Glu Asp Leu Leu Phe Ser Lys Val Lys Leu Ser Asp
915 920 925



Val Gly Phe Val Glu Ala Tyr Asn Asn Cys Thr Gly Gly Ala Glu Ile
930 935 940 



Arg Asp Leu Ile Cys Val Gln Ser Tyr Asn Gly Ile Lys Val Leu Pro
945 950 955 960



Pro Leu Leu Ser Val Asn Gln Ile Ser Gly Tyr Thr Leu Ala Ala Thr

965 970 975



Ser Ala Ser Leu Phe Pro Pro Leu Ser Ala Ala Val Gly Val Pro Phe 

980 985 990 



Tyr Leu Asn Val Gln Tyr Arg Ile Asn Gly Ile Gly Val Thr Met Asp
995 1000 1005



Val Leu Ser Gln Asn Gln Lys Leu Ile Ala Asn Ala Phe Asn Asn
1010 1015 1020 



Ala Leu Asp Ala Ile Gln Glu Gly Phe Asp Ala Thr Asn Ser Ala
1025 1030 1035



Leu Val Lys Ile Gln Ala Val Val Asn Ala Asn Ala Glu Ala Leu
1040 1045 1050



Asn Asn Leu Leu Gln Gln Leu Ser Asn Arg Phe Gly Ala Ile Ser
1055 1060 1065 



Ser Ser Leu Gln Glu Ile Leu Ser Arg Leu Asp Ala Leu Glu Ala
1070 1075 1080



Gln Ala Gln Ile Asp Arg Leu Ile Asn Gly Arg Leu Thr Ala Leu
1085 1090 1095 



Asn Val Tyr Val Ser Gln Gln Leu Ser Asp Ser Thr Leu Val Lys
1100 1105 1110

Phe Ser Ala Ala Gln Ala Met Glu Lys Val Asn Glu Cys Val Lys
1115 1120 1125



Ser Gln Ser Ser Arg Ile Asn Phe Cys Gly Asn Gly Asn His Ile
1130 1135 1140 



Ile Ser Leu Val Gln Asn Ala Pro Tyr Gly Leu Tyr Phe Ile His
1145 1150 1155 



Phe Ser Tyr Val Pro Thr Lys Tyr Val Thr Ala Lys Val Ser Pro
1160 1165 1170



Gly Leu Cys Ile Ala Gly Asp Arg Gly Ile Ala Pro Lys Ser Gly

1175 1180 1185



Tyr Phe Val Asn Val Asn Asn Thr Trp Met Phe Thr Gly Ser Gly
1190 1195 1200



Tyr Tyr Tyr Pro Glu Pro He Thr Gly Asn Asn Val Val Val Met 
1205 1210 1215 



Ser Thr Cys Ala Val Asn Tyr Thr Lys Ala Pro Asp Val Met Leu 
1220 1225 1230



Asn Ile Ser Thr Pro Asn Leu His Asp Phe Lys Glu Glu Leu Asp
1235 1240 1245



Gln Trp Phe Lys Asn Gln Thr Ser Val Ala Pro Asp Leu Ser Leu
1250 1255 1260 



Asp Tyr Ile Asn Val Thr Phe Leu Asp Leu Gln Asp Glu Met Asn
1265 1270 1275 



Arg Leu Gin Glu Ala He Lys Val Leu Asn Gin Ser Tyr He Asn 
1280 1285 1290 



Leu Lys Asp Ile Gly Thr Tyr Glu Tyr Tyr Val Lys Trp Pro Trp
1295 1300 1305

Tyr Val Trp Leu Leu Ile Gly Phe Ala Gly Val Ala Met Leu Val
1310 1315 1320



Leu Leu Phe Phe Ile Cys Cys Cys Thr Gly Cys Gly Thr Ser Cys
1325 1330 1335






Phe Lys Ile Cys Gly Gly Cys Cys Asp Asp Tyr Thr Gly His Gln
1340 1345 1350 



Glu Leu Val He Lys Thr Ser His Asp Asp 
1355 1360 



<210> 55 

<211> 1453 

<212> PRT

<213> canine coronavirus 

<400> 55 

Met lie Val Leu He Leu Cys Leu Leu Leu Phe Ser Tyr Asn Ser Val 
1 5 10 15 



Ile Cys Thr Ser Asn Asn Asp Cys Val Gln Gly Asn Val Thr Gln Leu

20 25 30



Pro Gly Asn Glu Asn Ile Ile Lys Asp Phe Leu Phe His Thr Phe Lys
35 40 45



Glu Glu Pro Ser Val Val Val Gly Gly Tyr Tyr Pro Thr Glu Val Trp
50 55 60 



Tyr Asn Cys Ser Arg Ser Ala Thr Thr Thr Ala Tyr Lys Asp Phe Ser 
65 70 75 80 



Asn Ile His Ala Phe Tyr Phe Asp Met Glu Ala Met Glu Asn Ser Thr

85 90 95 



Gly Asn Ala Arg Gly Lys Pro Leu Leu Val His Val His Gly Asp Pro 

100 105 110 



Val Ser He He He Tyr He Ser Ala Tyr Arg Asp Asp Val Gin Pro 
115 120 125 



Arg Pro Leu Leu Lys His Gly Leu Leu Cys Ile Thr Lys Asn Lys Ile
130 135 140

Ile Asp Tyr Asn Thr Phe Thr Ser Ala Gln Trp Ser Ala Ile Cys Leu
145 150 155 160

Gly Asp Asp Arg Lys Ile Pro Phe Ser Val Ile Pro Thr Asp Asn Gly

165 170 175

Thr Lys Ile Phe Gly Leu Glu Trp Asn Asp Asp Tyr Val Thr Ala Tyr

180 185 190

Ile Ser Asp Arg Ser His His Leu Asn Ile Asn Asn Asn Trp Phe Asn
195 200 205 



Asn Val Thr Ile Leu Tyr Ser Arg Ser Ser Ser Ala Thr Trp Gln Lys
210 215 220 



Ser Ala Ala Tyr Val Tyr Gln Gly Val Ser Asn Phe Thr Tyr Tyr Lys
225 230 235 240



Leu Asn Asn Thr Asn Gly Leu Lys Ser Tyr Glu Leu Cys Glu Asp Tyr

245 250 255



Glu Tyr Cys Thr Gly Tyr Ala Thr Asn Val Phe Ala Pro Thr Val Gly

260 265 270

Gly Tyr Ile Pro His Gly Phe Ser Phe Asn Asn Trp Phe Met Arg Thr
275 280 285 



Asn Ser Ser Thr Phe Val Ser Gly Arg Phe Val Thr Asn Gin Pro Leu 
290 295 300 



Leu Val Asn Cys Leu Trp Pro Val Pro Ser Phe Gly Val Ala Ala Gln
305 310 315 320



Gin Phe Cys Phe Glu Gly Ala Gin Phe Ser Gin Cys Asn Gly Val Ser 

325 330 335 



Leu Asn Asn Thr Val Asp Val Ile Arg Phe Asn Leu Asn Phe Thr Ala

340 345 350



Leu Val Gln Ser Gly Met Gly Ala Thr Val Phe Ser Leu Asn Thr Thr
355 360 365



Gly Gly Val He Leu Glu He Ser Cys Tyr Asn Asp Thr Val Ser Glu 
370 375 380 



Ser Ser Phe Tyr Ser Tyr Gly Glu Ile Ser Phe Gly Val Thr Asp Gly
385 390 395 400



Pro Arg Tyr Cys Phe Ala Leu Tyr Asn Gly Thr Ala Leu Lys Tyr Leu 

405 410 415



Gly Thr Leu Pro Pro Ser Val Lys Glu Ile Ala Ile Ser Lys Trp Gly

420 425 430



His Phe Tyr He Asn Gly Tyr Asn Phe Phe Ser Thr Phe Pro He Asp 
435 440 445 



Cys Ile Ser Phe Asn Leu Thr Thr Gly Asp Ser Gly Ala Phe Trp Thr
450 455 460



Ile Ala Tyr Thr Ser Tyr Thr Asp Ala Leu Val Gln Val Glu Asn Thr
465 470 475 480



Ala Ile Lys Lys Val Thr Tyr Cys Asn Ser His Ile Asn Asn Ile Lys

485 490 495 



Cys Ser Gln Leu Thr Ala Asn Leu Gln Asn Gly Phe Tyr Pro Val Ala

500 505 510

Ser Ser Glu Val Gly Leu Val Asn Lys Ser Val Val Leu Leu Pro Ser 
515 520 525



Phe Tyr Ser His Thr Ser Val Asn Ile Thr Ile Asp Leu Gly Met Lys
530 535 540



Arg Ser Gly Tyr Gly Gin Pro He Ala Ser Thr Leu Ser Asn He Thr 
545 550 555 560 



Leu Pro Met Gln Asp Asn Asn Thr Asp Val Tyr Cys Ile Arg Ser Asn

565 570 575 



Arg Phe Ser Val Tyr Phe His Ser Thr Cys Lys Ser Ser Leu Trp Asp 

580 585 590



Asp Val Phe Asn Ser Asp Cys Thr Asp Val Leu Tyr Ala Thr Ala Val 
595 600 605 



He Lys Thr Gly Thr Cys Pro Phe Ser Phe Asp Lys Leu Asn Asn Tyr 
610 615 620 



Leu Thr Phe Asn Lys Phe Cys Leu Ser Leu Asn Pro Val Gly Ala Asn 
625 630 635 640

Cys Lys Phe Asp Val Ala Ala Arg Thr Arg Thr Asn Glu Gln Val Val

645 650 655



Arg Ser Leu Tyr Val Ile Tyr Glu Glu Gly Asp Asn Ile Val Gly Val

660 665 670

Pro Ser Asp Asn Ser Gly Leu His Asp Leu Ser Val Leu His Leu Asp 
675 680 685



Ser Cys Thr Asp Tyr Asn Ile Tyr Gly Ile Thr Gly Val Gly Ile Ile
690 695 700 



Arg Gln Thr Asn Ser Thr Leu Leu Ser Gly Leu Tyr Tyr Thr Ser Leu
705 710 715 720



Ser Gly Asp Leu Leu Gly Phe Lys Asn Val Ser Asp Gly Val Ile Tyr

725 730 735



Ser Val Thr Pro Cys Asp Val Ser Ala His Ala Ala Val Ile Asp Gly

740 745 750



Ala Ile Val Gly Ala Met Thr Ser Ile Asn Ser Glu Leu Leu Gly Leu
755 760 765



Thr His Trp Thr Thr Thr Pro Asn Phe Tyr Tyr Tyr Ser Ile Tyr Asn
770 775 780 



Tyr Thr Asn Glu Arg Thr Arg Gly Thr Ala Ile Asp Ser Asn Asp Val
785 790 795 800



Asp Cys Glu Pro Ile Ile Thr Tyr Ser Asn Ile Gly Val Cys Lys Asn

805 810 815 



Gly Ala Leu Val Phe Ile Asn Val Thr His Ser Asp Gly Asp Val Gln

820 825 830



Pro Ile Ser Thr Gly Asn Val Thr Ile Pro Thr Asn Phe Thr Ile Ser
835 840 845



Val Gln Val Glu Tyr Ile Gln Val Tyr Thr Thr Pro Val Ser Ile Asp
850 855 860 



Cys Ser Arg Tyr Val Cys Asn Gly Asn Pro Arg Cys Asn Lys Leu Leu 
865 870 875 880

Thr Gln Tyr Val Ser Ala Cys Gln Thr Ile Glu Gln Ala Leu Ala Met

885 890 895



Gly Ala Arg Leu Glu Asn Met Glu Ile Asp Ser Met Leu Phe Val Ser

900 905 910

Glu Asn Ala Leu Lys Leu Ala Ser Val Glu Ala Phe Asn Ser Thr Glu
915 920 925 



Thr Leu Asp Pro Ile Tyr Lys Glu Trp Pro Asn Ile Gly Gly Ser Trp
930 935 940



Leu Gly Gly Leu Lys Asp Ile Leu Pro Ser His Asn Ser Lys Arg Lys
945 950 955 960 



Tyr Arg Ser Ala lie Glu Asp Leu Leu Phe Asp Lys Val Val Thr Ser 



965 970 975 



Gly Leu Gly Thr Val Asp Glu Asp Tyr Lys Arg Cys Thr Gly Gly Tyr 

980 985 990



Asp Ile Ala Asp Leu Val Cys Ala Gln Tyr Tyr Asn Gly Ile Met Val
995 1000 1005



Leu Pro Gly Val Ala Asn Asp Asp Lys Met Ala Met Tyr Thr Ala 
1010 1015 1020



Ser Leu Ala Gly Gly Ile Thr Leu Gly Ser Leu Gly Gly Gly Ala
1025 1030 1035



Val Ser Ile Pro Phe Ala Ile Ala Val Gln Ala Arg Leu Asn Tyr
1040 1045 1050



Val Ala Leu Gin Thr Asp Val Leu Asn Lys Asn Gin Gin He Leu 
1055 1060 1065 



Ala Asn Ala Phe Asn Gin Ala He Gly Asn He Thr Gin Ala Phe 
1070 1075 1080 



Gly Lys Val Asn Asp Ala He His Gin Thr Ser Gin Gly Leu Ala 
1085 1090 1095 



Thr Val Ala Lys Val Leu Ala Lys Val Gin Asp Val Val Asn Thr 
1100 1105 1110

Gln Gly Gln Ala Leu Ser His Leu Thr Leu Gln Leu Gln Asn Asn
1115 1120 1125

Phe Gln Ala Ile Ser Ser Ser Ile Ser Asp Ile Tyr Asn Arg Leu
1130 1135 1140

Asp Glu Leu Ser Ala Asp Ala Gln Val Asp Arg Leu Ile Thr Gly
1145 1150 1155

Arg Leu Thr Ala Leu Asn Ala Phe Val Ser Gln Thr Leu Thr Arg
1160 1165 1170 

Gln Ala Glu Val Arg Ala Ser Arg Gln Leu Ala Lys Asp Lys Val
1175 1180 1185



Asn Glu Cys Val Arg Ser Gln Ser Gln Arg Phe Gly Phe Cys Gly
1190 1195 1200



Asn Gly Thr His Leu Phe Ser Leu Ala Asn Ala Ala Pro Asn Gly 
1205 1210 1215 



Met Ile Phe Phe His Thr Val Leu Leu Pro Thr Ala Tyr Glu Thr
1220 1225 1230

Val Thr Ala Trp Ser Gly Ile Cys Ala Ser Asp Gly Asp Arg Thr
1235 1240 1245



Phe Gly Leu Val Val Lys Asp Val Gln Leu Thr Leu Phe Arg Asn
1250 1255 1260 



Leu Asp Asp Lys Phe Tyr Leu Thr Pro Arg Thr Met Tyr Gin Pro 
1265 1270 1275 



Ile Val Ala Thr Ser Ser Asp Phe Val Gln Ile Glu Gly Cys Asp
1280 1285 1290



Val Leu Phe Val Asn Ala Thr Val He Asp Leu Pro Ser He He 
1295 1300 1305 



Pro Asp Tyr He Asp He Asn Gin Thr Val Gin Asp He Leu Glu 
1310 1315 1320 



Asn Phe Arg Pro Asn Trp Thr Val Pro Glu Leu Pro Leu Asp Ile
1325 1330 1335



Phe Asn Ala Thr Tyr Leu Asn Leu Thr Gly Glu Ile Asn Asp Leu
1340 1345 1350

Glu Phe Arg Ser Glu Lys Leu His Asn Thr Thr Val Glu Leu Ala 

1355 1360 1365 

Ile Leu Ile Asp Asn Ile Asn Asn Thr Leu Val Asn Leu Glu Trp

1370 1375 1380



Leu Asn Arg Ile Glu Thr Tyr Val Lys Trp Pro Trp Tyr Val Trp
1385 1390 1395



Leu Leu Ile Gly Leu Val Val Ile Phe Cys Ile Pro Ile Leu Leu
1400 1405 1410



Phe Cys Cys Cys Ser Thr Gly Cys Cys Gly Cys He Gly Cys Leu 
1415 1420 1425 



Gly Ser Cys Cys His Ser Ile Cys Ser Arg Arg Gln Phe Glu Ser
1430 1435 1440 



Tyr Glu Pro He Glu Lys Val His Val His 
1445 1450 



<210> 56 
<211> 1464 
<212> PRT 

<213> Feline infectious peritonitis virus

<400> 56

Met Ile Phe Ile Ile Leu Thr Leu Leu Ser Val Ala Lys Ser Glu Asp
1 5 10 15



Ala Pro His Gly Val Thr Leu Pro Gln Phe Asn Thr Ser His Asn Asn

20 25 30 



Glu Arg Phe Glu Leu Asn Phe Tyr Asn Phe Leu Gin Thr Trp Asp He 
35 40 45 



Pro Pro Asn Thr Glu Thr Ile Leu Gly Gly Tyr Leu Pro Tyr Cys Gly
50 55 60



Ala Gly Val Asn Cys Gly Trp Tyr Asn Phe Ser Gin Ser Val Gly Gin 
65 70 75 80 



Asn Gly Lys Tyr Ala Tyr He Asn Thr Gin Asn Leu Asn He Pro Asn 

85 90 95

Val His Gly Val Tyr Phe Asp Val Arg Glu His Asn Asn Asp Gly Glu

100 105 110



Trp Asp Asp Arg Asp Lys Val Gly Leu Leu Ile Ala Ile His Gly Asn

115 120 125

Ser Lys Tyr Ser Leu Leu Met Val Leu Gin Asp Ala Val Glu Ala Asn 
130 135 140 



Gln Pro His Val Ala Val Lys Ile Cys His Trp Lys Pro Gly Asn Ile
145 150 155 160



Ser Ser Tyr His Ala Phe Ser Val Asn Leu Gly Asp Gly Gly Gin Cys 

165 170 175



Val Phe Asn Gln Arg Phe Ser Leu Asp Thr Val Leu Thr Thr Asn Asp

180 185 190



Phe Tyr Gly Phe Gln Trp Thr Asp Thr Tyr Val Asp Ile Tyr Leu Gly
195 200 205



Gly Thr Ile Thr Lys Val Trp Val Asp Asn Asp Trp Ser Ile Val Glu
210 215 220



Ala Ser Ile Ser Tyr His Trp Asn Arg Ile Asn Tyr Gly Tyr Tyr Met
225 230 235 240



Gln Phe Val Asn Arg Thr Thr Tyr Tyr Ala Tyr Asn Asn Thr Gly Gly

245 250 255



Ala Asn Tyr Thr Gin Leu Gin Leu Ser Glu Cys His Thr Asp Tyr Cys 

260 265 270 



Ala Gly Tyr Ala Lys Asn Val Phe Val Pro Ile Asp Gly Lys Ile Pro
275 280 285



Glu Asp Phe Ser Phe Ser Asn Trp Phe Leu Leu Ser Asp Lys Ser Thr 
290 295 300 



Leu Val Gin Gly Arg Val Leu Ser Ser Gin Pro Val Phe Val Gin Cys 
305 310 315 320 



Leu Arg Pro Val Pro Ser Trp Ser Asn Asn Thr Ala Val Val His Phe 

325 330 335

Lys Asn Asp Ala Phe Cys Pro Asn Val Thr Ala Asp Val Leu Arg Phe

340 345 350



Asn Leu Asn Phe Ser Asp Thr Asp Val Tyr Thr Asp Ser Thr Asn Asp
355 360 365

Glu Gln Leu Phe Phe Thr Phe Glu Asp Asn Thr Thr Ala Ser Ile Ala
370 375 380

Cys Tyr Ser Ser Ala Asn Val Thr Asp Phe Gln Pro Ala Asn Asn Ser
385 390 395 400

Val Ser His Ile Pro Phe Gly Lys Thr Ala His Phe Cys Phe Ala Asn

405 410 415 



Phe Ser His Ser Ile Val Ser Arg Gln Phe Leu Gly Ile Leu Pro Pro

420 425 430



Thr Val Arg Glu Phe Ala Phe Gly Arg Asp Gly Ser Ile Phe Val Asn
435 440 445 



Gly Tyr Lys Tyr Phe Ser Leu Pro Ala Ile Arg Ser Val Asn Phe Ser
450 455 460



Ile Ser Ser Val Glu Glu Tyr Gly Phe Trp Thr Ile Ala Tyr Thr Asn
465 470 475 480



Tyr Thr Asp Val Met Val Asp Val Asn Gly Thr Ala Ile Thr Arg Leu

485 490 495 



Phe Tyr Cys Asp Ser Pro Leu Asn Arg He Lys Cys Gin Gin Leu Lys 

500 505 510 



His Glu Leu Pro Asp Gly Phe Tyr Ser Ala Ser Met Leu Val Lys Lys 
515 520 525 



Asp Leu Pro Lys Thr Phe Val Thr Met Pro Gin Phe Tyr His Trp Met 
530 535 540 



Asn Val Thr Leu His Val Val Leu Asn Asp Thr Glu Lys Lys Tyr Asp
545 550 555 560 



Ile Ile Leu Ala Lys Ala Pro Glu Leu Ala Ala Leu Ala Asp Val His

565 570 575



Phe Glu Ile Ala Gln Ala Asn Gly Ser Val Thr Asn Val Thr Ser Leu
580 585 590



Cys Val Gln Ala Arg Gln Leu Ala Leu Phe Tyr Lys Tyr Thr Ser Leu
595 600 605

Gln Gly Leu Tyr Thr Tyr Ser Asn Leu Val Glu Leu Gln Asn Tyr Asp
610 615 620



Cys Pro Phe Ser Pro Gln Gln Phe Asn Asn Tyr Leu Gln Phe Glu Thr
625 630 635 640



Leu Cys Phe Asp Val Asn Pro Ala Val Ala Gly Cys Lys Trp Ser Leu 

645 650 655 



Val His Asp Val Gln Trp Arg Thr Gln Phe Ala Thr Ile Thr Val Ser

660 665 670 



Tyr Lys His Gly Ser Met He Thr Thr His Ala Lys Gly His Ser Trp 
675 680 685 



Gly Phe Gln Asp Thr Ser Val Leu Val Lys Asp Glu Cys Thr Asp Tyr
690 695 700

Asn Ile Tyr Gly Phe Gln Gly Thr Gly Ile Ile Arg Asn Thr Thr Ser
705 710 715 720



Arg Leu Val Ala Gly Leu Tyr Tyr Thr Ser He Ser Gly Asp Leu Leu 

725 730 735 



Ala Phe Lys Asn Ser Thr Thr Gly Glu He Phe Thr Val Val Pro Cys 

740 745 750 



Asp Leu Thr Ala Gln Val Ala Val Ile Asn Asp Glu Ile Val Gly Ala
755 760 765 



Ile Thr Ala Val Asn Gln Thr Asp Leu Phe Glu Phe Val Asn Asn Thr
770 775 780 



Gln Ala Arg Arg Ser Arg Ser Ser Thr Pro Asn Phe Val Thr Ser Tyr
785 790 795 800 



Thr Met Pro Gin Phe Tyr Tyr He Thr Lys Trp Asn Asn Asp Thr Ser 

805 810 815 



Ser Asn Cys Thr Ser Ala He Thr Tyr Ser Ser Phe Ala He Cys Asn 

820 825 830

Thr Gly Glu Ile Lys Tyr Val Asn Val Thr His Val Glu Ile Val Asp
835 840 845



Asp Ser Ile Gly Val Ile Lys Pro Val Ser Thr Gly Asn Ile Ser Ile
850 855 860

Pro Lys Asn Phe Thr Val Ala Val Gln Ala Glu Tyr Ile Gln Ile Gln

865 870 875 880



Val Lys Pro Val Val Val Asp Cys Ala Thr Tyr Val Cys Asn Gly Asn

885 890 895 



Thr His Cys Leu Lys Leu Leu Thr Gln Tyr Thr Ser Ala Cys Gln Thr

900 905 910 



Ile Glu Asn Ala Leu Asn Leu Gly Ala Arg Leu Glu Ser Leu Met Leu
915 920 925



Asn Asp Met Ile Thr Val Ser Asp Arg Gly Leu Glu Leu Ala Thr Val
930 935 940



Glu Arg Phe Asn Ala Thr Ala Leu Gly Gly Glu Lys Leu Gly Gly Leu 
945 950 955 960



Tyr Phe Asp Gly Leu Ser Ser Leu Leu Pro Pro Lys Ile Gly Lys Arg

965 970 975



Ser Ala Val Glu Asp Leu Leu Phe Asn Lys Val Val Thr Ser Gly Leu

980 985 990 



Gly Thr Val Asp Asp Asp Tyr Lys Lys Cys Ser Ser Gly Thr Asp Val
995 1000 1005



Ala Asp Leu Val Cys Ala Gln Tyr Tyr Asn Gly Ile Met Val Leu
1010 1015 1020 



Pro Gly Val Val Asp Gly Asn Lys Met Ser Met Tyr Thr Ala Ser 
1025 1030 1035



Leu He Gly Gly Met Ala Leu Gly Ser He Thr Ser Ala Val Ala 
1040 1045 1050 



Val Pro Phe Ala Met Gin Val Gin Ala Arg Leu Asn Tyr Val Ala 
1055 1060 1065

Leu Gln Thr Asp Val Leu Gln Glu Asn Gln Lys Ile Leu Ala Asn
1070 1075 1080



Ala Phe Asn Asn Ala Ile Gly Asn Ile Thr Leu Ala Leu Gly Lys
1085 1090 1095



Val Ser Asn Ala Ile Thr Thr Thr Ser Asp Gly Phe Asn Ser Met
1100 1105 1110 



Ala Ser Ala Leu Thr Lys He Gin Ser Val Val Asn Gin Gin Gly 
1115 1120 1125 



Glu Ala Leu Ser Gin Leu Thr Ser Gin Leu Gin Lys Asn Phe Gin 
1130 1135 1140 



Ala Ile Ser Ser Ser Ile Ala Glu Ile Tyr Asn Arg Leu Glu Lys
1145 1150 1155 



Val Glu Ala Asp Ala Gin Val Asp Arg Leu He Thr Gly Arg Leu 
1160 1165 1170 



Ala Ala Leu Asn Ala Tyr Val Ser Gln Thr Leu Thr Gln Tyr Ala
1175 1180 1185



Glu Val Lys Ala Ser Arg Gln Ile Ala Leu Glu Lys Val Asn Glu
1190 1195 1200



Cys Val Lys Ser Gin Ser Asn Arg Tyr Gly Phe Cys Gly Asn Gly 
1205 1210 1215



Thr His Leu Phe Ser Leu Val Asn Ser Ala Pro Glu Gly Leu Leu
1220 1225 1230 



Phe Phe His Thr Val Leu Leu Pro Thr Glu Trp Glu Glu Val Thr 
1235 1240 1245



Ala Trp Ser Gly Ile Cys Val Asn Asp Thr Tyr Ala Tyr Val Leu
1250 1255 1260 



Lys Asp Phe Asp His Ser He Phe Ser Tyr Asn Gly Thr Tyr Met 
1265 1270 1275 



Val Thr Pro Arg Asn Met Phe Gin Pro Arg Lys Pro Gin Met Ser 
1280 1285 1290

Asp Phe Val Gln Ile Thr Ser Cys Glu Val Thr Phe Leu Asn Met
1295 1300 1305



Thr Tyr Thr Thr Phe Gln Glu Ile Val Ile Asp Tyr Ile Asp Ile
1310 1315 1320



Asn Lys Thr Ile Ala Asp Met Leu Glu Gln Tyr Asn Pro Asn Tyr
1325 1330 1335



Thr Thr Pro Glu Leu Asn Leu Leu Leu Asp Ile Phe Asn Gln Thr
1340 1345 1350 



Lys Leu Asn Leu Thr Ala Glu Ile Asp Gln Leu Glu Gln Arg Ala
1355 1360 1365



Asp Asn Leu Thr Thr Ile Ala His Glu Leu Gln Gln Tyr Ile Asp
1370 1375 1380



Asn Leu Asn Lys Thr Leu Val Asp Leu Asp Trp Leu Asn Arg Ile
1385 1390 1395



Glu Thr Tyr Val Lys Trp Pro Trp Tyr Val Trp Leu Leu Ile Gly
1400 1405 1410

Leu Val Val Val Phe Cys Ile Pro Leu Leu Leu Phe Cys Cys Leu
1415 1420 1425



Ser Thr Gly Phe Cys Gly Cys Phe Gly Cys Val Gly Ser Cys Cys 
1430 1435 1440 



His Ser Leu Cys Ser Arg Arg Gin Phe Glu Thr Tyr Glu Pro He 
1445 1450 1455 



Glu Lys Val His He His 
1460 



<210> 57 

<211> 1235 

<212> PRT 

<213> Mouse hepatitis virus 

<400> 57 

Met Leu Phe Val Phe Ile Leu Leu Leu Pro Ser Cys Leu Gly Tyr Ile
1 5 10 15 



Gly Asp Phe Arg Cys He Gin Thr Val Asn Tyr Asn Gly Asn Asn Ala 

20 25 30

Ser Ala Pro Ser Ile Ser Thr Glu Ala Val Asp Val Ser Lys Gly Arg
35 40 45



Gly Thr Tyr Tyr Val Leu Asp Arg Val Tyr Leu Asn Ala Thr Leu Leu
50 55 60



Leu Thr Gly Tyr Tyr Pro Val Asp Gly Ser Asn Tyr Arg Asn Leu Ala 
65 70 75 80



Leu Thr Gly Thr Asn Thr Leu Ser Leu Thr Trp Phe Lys Pro Pro Phe

85 90 95



Leu Ser Glu Phe Asn Asp Gly Ile Phe Ala Lys Val Gln Asn Leu Lys

100 105 110 



Thr Asn Thr Pro Thr Gly Ala Thr Ser Tyr Phe Pro Thr Ile Val Ile
115 120 125



Gly Ser Leu Phe Gly Asn Thr Ser Tyr Thr Val Val Leu Glu Pro Tyr
130 135 140



Asn Asn Ile Ile Met Ala Ser Val Cys Thr Tyr Thr Ile Cys Gln Leu
145 150 155 160



Pro Tyr Thr Pro Cys Lys Pro Asn Thr Asn Gly Asn Arg Val Ile Gly

165 170 175



Phe Trp His Thr Asp Val Lys Pro Pro Ile Cys Leu Leu Lys Arg Asn

180 185 190 



Phe Thr Phe Asn Val Asn Ala Pro Trp Leu Tyr Phe His Phe Tyr Gin 
195 200 205 



Gln Gly Gly Thr Phe Tyr Ala Tyr Tyr Ala Asp Lys Pro Ser Ala Thr
210 215 220 



Thr Phe Leu Phe Ser Val Tyr Ile Gly Asp Ile Leu Thr Gln Tyr Phe
225 230 235 240



Val Leu Pro Phe He Cys Thr Pro Thr Ala Gly Ser Thr Leu Ala Pro 

245 250 255 



Leu Tyr Trp Val Thr Pro Leu Leu Lys Arg Gin Tyr Leu Phe Asn Phe 

260 265 270

Asn Glu Lys Gly Val Ile Thr Ser Ala Val Asp Cys Ala Ser Ser Tyr
275 280 285



Ile Ser Glu Ile Lys Cys Lys Thr Gln Ser Leu Leu Pro Ser Thr Gly
290 295 300

Val Tyr Asp Leu Ser Gly Tyr Thr Val Gln Pro Val Gly Val Val Tyr
305 310 315 320 



Arg Arg Val Pro Asn Leu Pro Asp Cys Lys He Glu Glu Trp Leu Thr 

325 330 335 



Ala Lys Ser Val Pro Ser Pro Leu Asn Trp Glu Arg Arg Thr Phe Gin 

340 345 350 



Asn Cys Asn Phe Asn Leu Ser Ser Leu Leu Arg Tyr Val Gln Ala Glu
355 360 365 



Ser Leu Ser Cys Asn Asn He Asp Ala Ser Lys Val Tyr Gly Met Cys 
370 375 380 



Phe Gly Ser Val Ser Val Asp Lys Phe Ala Ile Pro Arg Ser Arg Gln
385 390 395 400



Ile Asp Leu Gln Ile Gly Asn Ser Gly Phe Leu Gln Thr Ala Asn Tyr

405 410 415



Lys Ile Asp Thr Ala Ala Thr Ser Cys Gln Leu Tyr Tyr Ser Leu Pro
420 425 430



Lys Asn Asn Val Thr He Asn Asn Tyr Asn Pro Ser Ser Trp Asn Arg 
435 440 445 



Arg Tyr Gly Phe Lys Val Asn Asp Arg Cys Gin He Phe Ala Asn He 
450 455 460 



Leu Leu Asn Gly He Asn Ser Gly Thr Thr Cys Ser Thr Asp Leu Gin 
465 470 475 480 



Leu Pro Asn Thr Glu Val Ala Thr Gly Val Cys Val Arg Tyr Asp Leu 

485 490 495 



Tyr Gly He Thr Gly Gin Gly Val Phe Lys Glu Val Lys Ala Asp Tyr 

500 505 510

Tyr Asn Ser Trp Gln Ala Leu Leu Tyr Asp Val Asn Gly Asn Leu Asn
515 520 525



Gly Phe Arg Asp Leu Thr Thr Asn Lys Thr Tyr Thr Ile Arg Ser Cys
530 535 540 

Tyr Ser Gly Arg Val Ser Ala Ala Tyr His Lys Glu Ala Pro Glu Pro

545 550 555 560



Ala Leu Leu Tyr Arg Asn Ile Asn Cys Ser Tyr Val Phe Thr Asn Asn

565 570 575 



lie Ser Arg Glu Glu Asn Pro Leu Asn Tyr Phe Asp Ser Tyr Leu Gly 

580 585 590 



Cys Val Val Asn Ala Asp Asn Arg Thr Asp Glu Ala Leu Pro Asn Cys 
595 600 605 



Asn Leu Arg Met Gly Ala Gly Leu Cys Val Asp Tyr Ser Lys Ser Arg 
610 615 620 



Arg Ala Arg Arg Ser Val Ser Thr Gly Tyr Arg Leu Thr Thr Phe Glu
625 630 635 640



Pro Tyr Met Pro Met Leu Val Asn Asp Ser Val Gln Ser Val Gly Gly

645 650 655



Leu Tyr Glu Met Gln Ile Pro Thr Asn Phe Thr Ile Gly His His Glu

660 665 670 



Glu Phe Ile Gln Ile Arg Ala Pro Lys Val Thr Ile Asp Cys Ala Ala
675 680 685 



Phe Val Cys Gly Asp Asn Ala Ala Cys Arg Gln Gln Leu Val Glu Tyr
690 695 700 



Gly Ser Phe Cys Asp Asn Val Asn Ala Ile Leu Asn Glu Val Asn Asn
705 710 715 720 



Leu Leu Asp Asn Met Gin Leu Gin Val Ala Ser Ala Leu Met Gin Gly 

725 730 735 



Val Thr He Ser Ser Arg Leu Pro Asp Gly He Ser Gly Pro He Asp 

740 745 750 



Asp Ile Asn Phe Ser Pro Leu Leu Gly Cys Ile Gly Ser Thr Cys Ala
755 760 765



Glu Asp Gly Asn Gly Pro Ser Ala Ile Arg Gly Arg Ser Ala Ile Glu
770 775 780

Asp Leu Leu Phe Asp Lys Val Lys Leu Ser Asp Val Gly Phe Val Glu

785 790 795 800



Ala Tyr Asn Asn Cys Thr Gly Gly Gln Glu Val Arg Asp Leu Leu Cys

805 810 815



Val Gln Ser Phe Asn Gly Ile Lys Val Leu Pro Pro Val Leu Ser Glu

820 825 830



Ser Gln Ile Ser Gly Tyr Thr Ala Gly Ala Thr Ala Ala Ala Met Phe
835 840 845 



Pro Pro Trp Thr Ala Ala Ala Gly Val Pro Phe Ser Leu Asn Val Gin 
850 855 860 



Tyr Arg Ile Asn Gly Leu Gly Val Thr Met Asn Val Leu Ser Glu Asn
865 870 875 880

Gln Lys Met Ile Ala Ser Ala Phe Asn Asn Ala Leu Gly Ala Ile Gln

885 890 895



Glu Gly Phe Asp Ala Thr Asn Ser Ala Leu Gly Lys Ile Gln Ser Val

900 905 910



Val Asn Ala Asn Ala Glu Ala Leu Asn Asn Leu Leu Asn Gin Leu Ser 
915 920 925 



Asn Arg Phe Gly Ala Ile Ser Ala Ser Leu Gln Glu Ile Leu Thr Arg
930 935 940



Leu Asp Ala Val Glu Ala Lys Ala Gin He Asp Arg Leu He Asn Gly 
945 950 955 960 



Arg Leu Thr Ala Leu Asn Ala Tyr He Ser Lys Gin Leu Ser Asp Ser 

965 970 975 



Thr Leu He Lys Phe Ser Ala Ala Gin Ala He Glu Lys Val Asn Glu 

980 985 990 



Cys Val Lys Ser Gin Thr Thr Arg He Asn Phe Cys Gly Asn Gly Asn 
995 1000 1005

His Ile Leu Ser Leu Val Gln Asn Ala Pro Tyr Gly Leu Cys Phe
1010 1015 1020



Ile His Phe Ser Tyr Val Pro Thr Ser Phe Lys Thr Ala Asn Val
1025 1030 1035



Ser Pro Gly Leu Cys He Ser Gly Asp Arg Gly Leu Ala Pro Lys 
1040 1045 1050 



Ala Gly Tyr Phe Val Gin Asp Asn Gly Glu Trp Lys Phe Thr Gly 
1055 1060 1065 



Ser Asn Tyr Tyr Tyr Pro Glu Pro Ile Thr Asp Lys Asn Ser Val
1070 1075 1080



Ala Met Ile Ser Cys Ala Val Asn Tyr Thr Lys Ala Pro Glu Val
1085 1090 1095



Phe Leu Asn Asn Ser Ile Pro Asn Leu Pro Asp Phe Lys Glu Glu
1100 1105 1110



Leu Asp Lys Trp Phe Lys Asn Gln Thr Ser Ile Ala Pro Asp Leu
1115 1120 1125



Ser Leu Asp Phe Glu Lys Leu Asn Val Thr Phe Leu Asp Leu Thr
1130 1135 1140



Tyr Glu Met Asn Arg Ile Gln Asp Ala Ile Lys Lys Leu Asn Glu
1145 1150 1155



Ser Tyr He Asn Leu Lys Glu Val Gly Thr Tyr Glu Met Tyr Val 
1160 1165 1170 



Lys Trp Pro Trp Tyr Val Trp Leu Leu Ile Gly Leu Ala Gly Val
1175 1180 1185



Ala Val Cys Val Leu Leu Phe Phe Ile Cys Cys Cys Thr Gly Cys
1190 1195 1200



Gly Ser Cys Cys Phe Arg Lys Cys Gly Ser Cys Cys Asp Glu Tyr 
1205 1210 1215



Gly Gly His Gin Asp Ser He Val He His Asn He Ser Ala His 
1220 1225 1230

Glu Asp
1235



<210> 58 

<211> 1363

<212> PRT

<213> human coronavirus

<400> 58

Met Phe Leu Ile Leu Leu Ile Ser Leu Pro Met Ala Leu Ala Val Ile
1 5 10 15



Gly Asp Leu Lys Cys Thr Thr Val Ala He Asn Asp Val Asp Thr Gly 

20 25 30 



Val Pro Ser Thr Ser Thr Asp He Val Asp Val Thr Asn Gly Leu Gly 
35 40 45 



Thr Tyr Tyr Val Leu Asp Arg Val Tyr Leu Asn Thr Thr Leu Leu Leu
50 55 60



Asn Gly Tyr Tyr Pro Thr Ser Gly Ser Thr Tyr Arg Asn Met Ala Leu
65 70 75 80



Lys Gly Thr Leu Leu Leu Ser Arg Leu Trp Phe Lys Pro Pro Phe Leu 

85 90 95



Ser Asp Phe Ile Asn Gly Ile Phe Ala Lys Val Lys Asn Thr Lys Val

100 105 110



He Lys His Gly Val Met Tyr Ser Glu Phe Pro Ala He Thr He Gly 
115 120 125 



Ser Thr Phe Val Asn Thr Ser Tyr Ser Val Val Val Gin Pro His Thr 
130 135 140 



Thr Asn Leu Asp Asn Lys Leu Gln Gly Leu Leu Glu Ile Ser Val Cys
145 150 155 160



Gin Tyr Thr Met Cys Glu Tyr Pro Asn Thr He Cys His Pro Asn Leu 

165 170 175 



Gly Asn Arg Arg Val Glu Leu Trp His Trp Asp Thr Gly Val Val Ser

180 185 190 



Cys Leu Tyr Lys Arg Asn Phe Thr Tyr Asp Val Asn Ala Asp Tyr Leu
195 200 205

Tyr Phe His Phe Tyr Gln Glu Gly Gly Ile Phe Tyr Ala Tyr Phe Thr
210 215 220

Asp Thr Gly Val Val Thr Lys Phe Leu Phe Asn Val Tyr Leu Gly Thr
225 230 235 240



Val Leu Ser Tyr Tyr Tyr Val Met Pro Leu Thr Cys Asn Ser Ala Met 

245 250 255



Thr Leu Glu Tyr Trp Val Thr Pro Leu Thr Ser Lys Gln Tyr Leu Leu

260 265 270 



Ala Phe Asn Gln Asp Gly Val Ile Phe Asn Ala Val Asp Cys Lys Ser
275 280 285



Asp Phe Met Ser Glu Ile Lys Cys Lys Thr Leu Ser Ile Ala Pro Ser
290 295 300



Thr Gly Val Tyr Glu Leu Asn Gly Tyr Thr Val Gln Pro Ile Ala Asp

305 310 315 320

Val Tyr Arg Arg Ile Pro Asn Leu Pro Asp Cys Asn Ile Glu Ala Trp

325 330 335



Leu Asn Asp Lys Ser Val Pro Ser Pro Leu Asn Trp Glu Arg Lys Thr

340 345 350



Phe Ser Asn Cys Asn Phe Asn Met Ser Ser Leu Met Ser Phe He Gin 
355 360 365 



Ala Asp Ser Phe Thr Cys Asn Asn Ile Asp Ala Ala Lys Ile Tyr Gly
370 375 380



Met Cys Phe Ser Ser Ile Thr Ile Asp Lys Phe Ala Ile Pro Asn Gly
385 390 395 400



Arg Lys Val Asp Leu Gin Leu Gly Asn Leu Gly Tyr Leu Gin Ser Phe 

405 410 415 



Asn Tyr Arg He Asp Thr Thr Ala Thr Ser Cys Gin Leu Tyr Tyr Asn 

420 425 430 



Leu Pro Ala Ala Asn Val Ser Val Ser Arg Phe Asn Pro Ser Ile Trp
435 440 445

Asn Arg Arg Phe Gly Phe Thr Glu Gln Ser Val Phe Lys Pro Gln Pro
450 455 460



Ala Gly Val Phe Thr Asp His Asp Val Val Tyr Ala Gln His Cys Phe
465 470 475 480

Lys Ala Pro Thr Asn Phe Cys Pro Cys Lys Leu Asp Gly Ser Leu Cys

485 490 495



Val Gly Asn Gly Pro Gly Ile Asp Ala Gly Tyr Lys Asn Ser Gly Ile

500 505 510 



Gly Thr Cys Pro Ala Gly Thr Asn Tyr Leu Thr Cys His Asn Ala Val 
515 520 525 



Gln Cys Asn Cys Leu Cys Thr Pro Asp Pro Ile Thr Ser Lys Ser Thr
530 535 540



Gly Pro Tyr Lys Cys Pro Gln Thr Lys Tyr Leu Val Gly Ile Gly Glu
545 550 555 560



His Cys Ser Gly Leu Ala Ile Lys Ser Asp Tyr Cys Gly Gly Asn Pro
565 570 575



Cys Thr Cys Gin Pro Gin Ala Phe Leu Gly Trp Ser Val Asp Ser Cys 

580 585 590 



Leu Gln Gly Asp Arg Cys Asn Ile Phe Ala Asn Phe Ile Leu His Asp
595 600 605 



Val Asn Ser Gly Thr Thr Cys Ser Thr Asp Leu Gin Lys Ser Asn Thr 
610 615 620 



Asp Ile Ile Leu Gly Val Cys Val Asn Tyr Asp Leu Tyr Gly Ile Thr
625 630 635 640



Gly Gln Gly Ile Phe Val Glu Val Asn Ala Pro Tyr Tyr Asn Ser Trp

645 650 655



Gin Asn Leu Leu Tyr Asp Ser Asn Gly Asn Leu Tyr Gly Phe Arg Asp 

660 665 670 



Tyr Leu Thr Asn Arg Thr Phe Met He Arg Ser Cys Tyr Ser Gly Arg 
675 680 685

Val Ser Ala Ala Phe His Ala Asn Ser Ser Glu Pro Ala Leu Leu Phe
690 695 700



Arg Asn Ile Lys Cys Asn Tyr Val Phe Asn Asn Thr Leu Ser Arg Gln
705 710 715 720



Leu Gln Pro Ile Asn Tyr Phe Asp Ser Tyr Leu Gly Cys Val Val Asn

725 730 735



Ala Asp Asn Ser Thr Ala Ser Ala Val Gln Thr Cys Asp Leu Thr Val

740 745 750



Gly Ser Gly Tyr Cys Val Asp Tyr Ser Thr Lys Arg Arg Ser Arg Arg
755 760 765



Ala Ile Thr Thr Gly Tyr Arg Phe Thr Asn Phe Glu Pro Phe Thr Val
770 775 780



Asn Ser Val Asn Asp Ser Leu Glu His Val Gly Gly Leu Tyr Glu Ile
785 790 795 800



Gln Ile Pro Ser Glu Phe Thr Ile Gly Asn Met Glu Glu Phe Ile Gln

805 810 815



Thr Ser Ser Pro Lys Val Thr He Asp Cys Ser Ala Phe Val Cys Gly 

820 825 830 



Asp Cys Ala Ala Cys Lys Ser Gln Leu Val Glu Tyr Gly Ser Phe Cys

835 840 845



Asp Asn He Asn Ala He Leu Thr Glu Val Asn Glu Leu Leu Asp Thr 
850 855 860 



Thr Gln Leu Gln Val Ala Asn Ser Leu Met Asn Gly Val Thr Leu Ser
865 870 875 880



Thr Lys Leu Lys Asp Gly Val Asn Phe Asn Val Asp Asp Val Asn Phe

885 890 895



Ser Pro Val Leu Gly Cys Leu Gly Ser Glu Cys Asn Lys Val Ser Ser

900 905 910



Arg Ser Ala Ile Glu Asp Leu Leu Phe Ser Lys Val Arg Leu Ser Asp
915 920 925

Val Gly Phe Val Glu Ala Tyr Asn Asn Cys Thr Gly Gly Ala Gly Ile
930 935 940



Arg Asp Leu Ile Cys Val Gln Ser Tyr Asn Gly Ile Lys Val Leu Pro
945 950 955 960



Pro Leu Leu Ser Asp Asn Gln Ile Ser Gly Tyr Thr Leu Ala Ala Thr

965 970 975



Ser Ala Asn Leu Phe Pro Pro Trp Ser Ala Ala Ala Gly Val Pro Phe

980 985 990



Tyr Leu Asn Val Gln Tyr Arg Ile Asn Gly Ile Gly Val Thr Met Asp
995 1000 1005 



Val Leu Ser Gln Asn Gln Lys Leu Ile Ala Asn Ala Phe Asn Asn
1010 1015 1020



Ala Leu Asp Ala Ile Gln Glu Gly Phe Asp Ala Thr Asn Ser Ala
1025 1030 1035



Leu Val Lys Ile Gln Ala Val Val Asn Ala Asp Ala Glu Ala Leu
1040 1045 1050



Asn Asn Leu Leu Gln Gln Leu Ser Asn Arg Phe Gly Ala Ile Ser
1055 1060 1065



Ser Ser Leu Gln Glu Ile Leu Ser Arg Leu Asp Ala Leu Glu Ala
1070 1075 1080 



Gln Ala Gln Ile Asp Arg Leu Ile Asn Gly Arg Leu Thr Ala Leu
1085 1090 1095



Asp Ala Tyr Val Ser Gln Gln Leu Ser Asp Ser Thr Leu Val Lys
1100 1105 1110



Phe Ser Ala Ala Gln Ala Met Glu Lys Val Asn Glu Cys Val Lys
1115 1120 1125



Ser Gln Ser Ser Arg Ile Asn Phe Cys Gly Asn Gly Asn His Ile
1130 1135 1140



He Ser Leu Val Gin Asn Ala Pro Tyr Gly Leu Tyr Phe He His 
1145 1150 1155 



Phe Ser Tyr Val Pro Thr Lys Tyr Val Thr Ala Lys Val Ser Pro
1160 1165 1170



Gly Leu Cys Ile Ala Gly Asp Arg Gly Ile Ala Pro Lys Ser Gly
1175 1180 1185



Tyr Phe Val Asn Val Asn Asn Thr Trp Met Phe Thr Gly Ser Arg
1190 1195 1200



Tyr Tyr Tyr Pro Glu Pro Ile Thr Gly Asp Asn Val Val Val Met
1205 1210 1215



Ser Thr Cys Ala Val Asn Tyr Thr Lys Ala Pro Asp Val Met Leu
1220 1225 1230



Asn Ile Ser Thr Pro Asn Leu Pro Asp Phe Lys Glu Glu Leu Asp
1235 1240 1245



Gln Trp Phe Lys Asn Gln Thr Leu Val Ala Pro Asp Leu Ser Leu
1250 1255 1260



Asp Tyr Ile Asn Val Thr Phe Leu Asp Leu Gln Asp Glu Met Asn
1265 1270 1275



Arg Leu Gln Glu Ala Ile Lys Val Leu Asn Gln Ser Tyr Ile Asn
1280 1285 1290



Leu Lys Asp Ile Gly Thr Tyr Glu Tyr Tyr Val Lys Trp Pro Trp
1295 1300 1305



Tyr Val Trp Leu Leu He Gly Phe Ala Gly Val Ala Met Leu Val 
1310 1315 1320 



Leu Leu Phe Phe Ile Cys Cys Cys Thr Gly Cys Gly Thr Ser Cys
1325 1330 1335



Phe Lys Lys Cys Gly Gly Cys Cys Asp Asp Tyr Thr Gly His Gin 
1340 1345 1350 



Glu Leu Val Ile Lys Thr Ser His Glu Gly
1355 1360 



<210> 59 

<211> 1383 

<212> PRT 

<213> Porcine epidemic diarrhea virus

<400> 59

Met Arg Ser Leu Ile Tyr Phe Trp Leu Leu Leu Pro Val Leu Pro Thr
1 5 10 15



Leu Ser Leu Pro Gln Asp Val Thr Arg Cys Gln Ser Thr Thr Asn Phe

20 25 30



Arg Arg Phe Phe Ser Lys Phe Asn Val Gln Ala Pro Ala Val Val Val
35 40 45



Leu Gly Gly Tyr Leu Pro Ser Met Asn Ser Ser Ser Trp Tyr Cys Gly 
50 55 60 



Thr Gly Ile Glu Thr Ala Ser Gly Val His Gly Ile Phe Leu Ser Tyr
65 70 75 80



Ile Asp Ser Gly Gln Gly Phe Glu Ile Gly Ile Ser Gln Glu Pro Phe

85 90 95



Asp Pro Ser Gly Tyr Gln Leu Tyr Leu His Lys Ala Thr Asn Gly Asn
100 105 110



Thr Asn Ala Thr Ala Arg Leu Arg Ile Cys Gln Phe Pro Asp Asn Lys
115 120 125



Thr Leu Gly Pro Thr Val Asn Asp Val Thr Thr Gly Arg Asn Cys Leu 
130 135 140



Phe Asn Lys Ala Ile Pro Ala Tyr Met Arg Asp Gly Lys Asp Ile Val
145 150 155 160



Val Gly He Thr Trp Asp Asn Asp Arg Val Thr Val Phe Ala Asp Lys 

165 170 175 



Ile Tyr His Phe Tyr Leu Lys Asn Asp Trp Ser Arg Val Ala Thr Arg
180 185 190



Cys Tyr Asn Arg Arg Ser Cys Ala Met Gin Tyr Val Tyr Thr Pro Thr 
195 200 205 



Tyr Tyr Met Leu Asn Val Thr Ser Ala Gly Glu Asp Gly He Tyr Tyr 
210 215 220 



Glu Pro Cys Thr Ala Asn Cys Thr Gly Tyr Ala Ala Asn Val Phe Ala 
225 230 235 240

Thr Asp Ser Asn Gly His Ile Pro Glu Gly Phe Ser Phe Asn Asn Trp

245 250 255



Phe Leu Leu Ser Asn Asp Ser Thr Leu Leu His Gly Lys Val Val Ser

260 265 270



Asn Gln Pro Leu Leu Val Asn Cys Leu Leu Ala Ile Pro Lys Ile Tyr
275 280 285 



Gly Leu Gly Gln Phe Phe Ser Phe Asn His Thr Met Asp Gly Val Cys
290 295 300



Asn Gly Ala Ala Val Asp Arg Ala Pro Glu Ala Leu Arg Phe Asn Ile
305 310 315 320



Asn Asp Thr Ser Val Ile Leu Ala Glu Gly Ser Ile Val Leu His Thr

325 330 335 



Ala Leu Gly Thr Asn Leu Ser Phe Val Cys Ser Asn Ser Ser Asp Pro 

340 345 350



His Leu Ala Ile Phe Ala Ile Pro Leu Gly Ala Thr Glu Val Pro Tyr
355 360 365



Tyr Cys Phe Leu Lys Val Asp Thr Tyr Asn Ser Thr Val Tyr Lys Phe 
370 375 380 



Leu Ala Val Leu Pro Ser Thr Val Arg Glu Ile Val Ile Thr Lys Tyr
385 390 395 400



Gly Asp Val Tyr Val Asn Gly Phe Gly Tyr Leu His Leu Gly Leu Leu 

405 410 415



Asp Ala Val Thr Ile Tyr Phe Thr Gly His Gly Thr Asp Asp Asp Val

420 425 430



Ser Gly Phe Trp Thr Ile Ala Ser Thr Asn Phe Val Asp Ala Leu Ile
435 440 445



Glu Val Gln Gly Thr Ser Ile Gln Arg Ile Leu Tyr Cys Asp Asp Pro
450 455 460



Val Ser Gln Leu Lys Cys Ser Gln Val Ala Phe Asp Leu Asp Asp Gly
465 470 475 480



Phe Tyr Pro Ile Ser Ser Arg Asn Leu Leu Ser His Glu Gln Pro Ile
485 490 495



Ser Phe Val Thr Leu Pro Ser Phe Asn Asp His Ser Phe Val Asn Ile

500 505 510



Thr Val Ser Ala Ala Phe Gly Gly Leu Ser Ser Ala Asn Leu Val Ala
515 520 525



Ser Asp Thr Thr Ile Asn Gly Phe Ser Ser Phe Cys Val Asp Thr Arg
530 535 540



Gln Phe Thr Ile Thr Leu Phe Tyr Asn Val Thr Asn Ser Tyr Gly Tyr
545 550 555 560 



Val Ser Lys Ser Gin Asp Ser Asn Cys Pro Phe Thr Leu Gin Ser Val 

565 570 575 



Asn Asp Tyr Leu Ser Phe Ser Lys Phe Cys Val Ser Thr Ser Leu Leu 

580 585 590



Ala Gly Ala Cys Thr He Asp Leu Phe Gly Tyr Pro Ala Phe Gly Ser 
595 600 605 



Gly Val Lys Leu Thr Ser Leu Tyr Phe Gln Phe Thr Lys Gly Glu Leu
610 615 620



He Thr Gly Thr Pro Lys Pro Leu Glu Gly lie Thr Asp Val Ser Phe 
625 630 635 640 



Met Thr Leu Asp Val Cys Thr Lys Tyr Thr He Tyr Gly Phe Lys Gly 

645 650 655 



Glu Gly Ile Ile Thr Leu Thr Asn Ser Ser Ile Leu Ala Gly Val Tyr

660 665 670 



Tyr Thr Ser Asp Ser Gly Gin Leu Leu Ala Phe Lys Asn Val Thr Ser 
675 680 685 



Gly Ala Val Tyr Ser Val Thr Pro Cys Ser Phe Ser Glu Gln Ala Ala
690 695 700



Tyr Val Asn Asp Asp He Val Gly Val lie Ser Ser Leu Ser Asn Ser 
705 710 715 720 



Thr Phe Asn Asn Thr Arg Glu Leu Pro Gly Phe Phe Tyr His Ser Asn 

725 730 735

Asp Gly Ser Asn Cys Thr Glu Pro Val Leu Val Tyr Ser Asn Ile Gly

740 745 750

Val Cys Lys Ser Gly Ser Ile Gly Tyr Val Pro Ser Gln Tyr Gly Gln

755 760 765



Val Lys Ile Ala Pro Thr Val Thr Gly Asn Ile Ser Ile Pro Thr Asn
770 775 780



Phe Ser Met Ser Ile Arg Thr Glu Tyr Leu Gln Leu Tyr Asn Thr Pro
785 790 795 800



Val Ser Val Asp Cys Ala Thr Tyr Val Cys Asn Gly Asn Ser Arg Cys 

805 810 815



Lys Gln Leu Leu Thr Gln Tyr Thr Ala Ala Cys Lys Thr Ile Glu Ser

820 825 830



Ala Leu Gln Leu Ser Ala Arg Leu Glu Ser Val Glu Val Asn Ser Met
835 840 845

Leu Thr Ile Ser Glu Glu Ala Leu Gln Leu Ala Thr Ile Ser Ser Phe
850 855 860



Asn Gly Asp Gly Tyr Asn Phe Thr Asn Val Leu Gly Ala Ser Val Tyr
865 870 875 880



Asp Pro Ala Ser Gly Arg Val Val Gln Lys Arg Ser Val Ile Glu Asp

885 890 895



Leu Leu Phe Asn Lys Val Val Thr Asn Gly Leu Gly Thr Val Asp Glu 

900 905 910



Asp Tyr Lys Arg Cys Ser Asn Gly Arg Ser Val Ala Asp Leu Val Cys
915 920 925 



Ala Gin Tyr Tyr Ser Gly Val Met Val Leu Pro Gly Val Val Asp Ala 
930 935 940 



Glu Lys Leu His Met Tyr Ser Ala Ser Leu Ile Gly Gly Met Ala Leu
945 950 955 960



Gly Gly He Thr Ala Ala Ala Ala Leu Pro Phe Ser Tyr Ala Val Gin 

965 970 975

Ala Arg Leu Asn Tyr Leu Ala Leu Gln Thr Asp Val Leu Gln Arg Asn

980 985 990



Gln Gln Leu Leu Ala Glu Ser Phe Asn Ser Ala Ile Gly Asn Ile Thr
995 1000 1005



Ser Ala Phe Glu Ser Val Lys Glu Ala lie Ser Gin Thr Ser Lys 
1010 1015 1020 



Gly Leu Asn Thr Val Ala His Ala Leu Thr Lys Val Gin Glu Val 
1025 1030 1035 



Val Asn Ser Gln Gly Ser Ala Leu Asn Gln Leu Thr Val Gln Leu
1040 1045 1050



Gln His Asn Phe Gln Ala Ile Ser Ser Ser Ile Asp Asp Ile Tyr
1055 1060 1065



Ser Arg Leu Asp Ile Leu Leu Ala Asp Val Gln Val Asp Arg Leu
1070 1075 1080



Ile Thr Gly Arg Leu Ser Ala Leu Asn Ala Phe Val Ala Gln Thr
1085 1090 1095



Leu Thr Lys Tyr Thr Glu Val Gln Ala Ser Arg Lys Leu Ala Gln
1100 1105 1110



Gln Lys Val Asn Glu Cys Val Lys Ser Gln Ser Gln Arg Tyr Gly
1115 1120 1125



Phe Cys Gly Gly Asp Gly Glu His Ile Phe Ser Leu Val Gln Ala
1130 1135 1140 



Ala Pro Gln Gly Leu Leu Phe Leu His Thr Val Leu Val Pro Gly
1145 1150 1155



Asp Phe Val Asn Val Leu Ala He Ala Gly Leu Cys Val Asn Gly 
1160 1165 1170 



Glu Ile Ala Leu Thr Leu Arg Glu Pro Gly Leu Val Leu Phe Thr
1175 1180 1185



His Glu Leu Gin Thr Tyr Thr Ala Thr Glu Tyr Phe Val Ser Ser 
1190 1195 1200

Arg Arg Met Phe Glu Pro Arg Lys Pro Thr Val Ser Asp Phe Val
1205 1210 1215



Gln Ile Glu Ser Cys Val Val Thr Tyr Val Asn Leu Thr Ser Asp
1220 1225 1230

Gln Leu Pro Asp Val Ile Pro Asp Tyr Ile Asp Val Asn Lys Thr
1235 1240 1245



Leu Asp Glu Ile Leu Ala Ser Leu Pro Asn Arg Thr Gly Pro Ser
1250 1255 1260 



Leu Pro Leu Asp Val Phe Asn Ala Thr Tyr Leu Asn Leu Thr Gly 
1265 1270 1275



Glu Ile Ala Asp Leu Glu Gln Arg Ser Glu Ser Leu Arg Asn Thr
1280 1285 1290

Thr Glu Glu Leu Arg Ser Leu He Asn Asn He Asn Asn Thr Leu 
1295 1300 1305 


Val Asp Leu Glu Trp Leu Asn Arg Val Glu Thr Tyr He Lys Trp 
1310 1315 1320



Pro Trp Trp Val Trp Leu Ile Ile Val Ile Val Leu Ile Phe Val
1325 1330 1335



Val Ser Leu Leu Val Phe Cys Cys He Ser Thr Gly Cys Cys Gly 
1340 1345 1350 



Cys Cys Gly Cys Cys Gly Ala Cys Phe Ser Gly Cys Cys Arg Gly 
1355 1360 1365



Pro Arg Leu Gln Pro Tyr Glu Ala Phe Glu Lys Val His Val Gln
1370 1375 1380 



<210> 60

<211> 1349 

<212> PRT 

<213> porcine hemagglutinating encephalomyelitis virus 

<400> 60 

Met Phe Phe Ile Leu Leu Ile Ser Leu Pro Ser Ala Phe Ala Val Ile
1 5 10 15



Gly Asp Leu Lys Cys Thr Thr Ser Leu He Asn Asp Val Asp Thr Gly 

20 25 30

Val Pro Ser Ile Ser Ser Glu Val Val Asp Val Thr Asn Gly Leu Gly
35 40 45



Thr Phe Tyr Val Leu Asp Arg Val Tyr Leu Asn Thr Thr Leu Leu Leu
50 55 60

Asn Gly Tyr Tyr Pro Ile Ser Gly Ala Thr Phe Arg Asn Met Ala Leu
65 70 75 80



Lys Gly Thr Arg Leu Leu Ser Thr Leu Trp Phe Lys Pro Pro Phe Leu

85 90 95



Ser Pro Phe Asn Asp Gly He Phe Ala Lys Val Lys Asn Ser Arg Phe 

100 105 110 



Ser Lys Asp Gly Val Ile Tyr Ser Glu Phe Pro Ala Ile Thr Ile Gly
115 120 125



Ser Thr Phe Val Asn Thr Ser Tyr Ser Ile Val Val Glu Pro His Thr
130 135 140 



Ser Leu Ile Asn Gly Asn Leu Gln Gly Leu Leu Gln Ile Ser Val Cys
145 150 155 160



Gin Tyr Thr Met Cys Glu Tyr Pro His Thr He Cys His Pro Asn Leu 

165 170 175 



Gly Asn Gln Arg Ile Glu Leu Trp His Tyr Asp Thr Asp Val Val Ser

180 185 190 



Cys Leu Tyr Arg Arg Asn Phe Thr Tyr Asp Val Asn Ala Asp Tyr Leu
195 200 205



Tyr Phe His Phe Tyr Gln Glu Gly Gly Thr Phe Tyr Ala Tyr Phe Thr
210 215 220



Asp Thr Gly Phe Val Thr Lys Phe Leu Phe Lys Leu Tyr Leu Gly Thr 
225 230 235 240 



Val Leu Ser His Tyr Tyr Val Met Pro Leu Thr Cys Asn Ser Ala Leu 

245 250 255 



Ser Leu Glu Tyr Trp Val Thr Pro Leu Thr Thr Arg Gin Phe Leu Leu 

260 265 270

Ala Phe Asp Gln Asp Gly Val Leu Tyr His Ala Val Asp Cys Ala Ser
275 280 285



Asp Phe Met Ser Glu Ile Met Cys Lys Thr Ser Ser Ile Thr Pro Pro

290 295 300

Thr Gly Val Tyr Glu Leu Asn Gly Tyr Thr Val Gln Pro Val Ala Thr
305 310 315 320 



Val Tyr Arg Arg Ile Pro Asp Leu Pro Asn Cys Asp Ile Glu Ala Trp

325 330 335



Leu Asn Ser Lys Thr Val Ser Ser Pro Leu Asn Trp Glu Arg Lys Ile

340 345 350



Phe Ser Asn Cys Asn Phe Asn Met Gly Arg Leu Met Ser Phe Ile Gln
355 360 365



Ala Asp Ser Phe Gly Cys Asn Asn Ile Asp Ala Ser Arg Leu Tyr Gly
370 375 380



Met Cys Phe Gly Ser Ile Thr Ile Asp Lys Phe Ala Ile Pro Asn Ser
385 390 395 400



Arg Lys Val Asp Leu Gln Val Gly Lys Ser Gly Tyr Leu Gln Ser Phe

405 410 415



Asn Tyr Lys Ile Asp Thr Ala Val Ser Ser Cys Gln Leu Tyr Tyr Ser

420 425 430



Leu Pro Ala Ala Asn Val Ser Val Thr His Tyr Asn Pro Ser Ser Trp
435 440 445



Asn Arg Arg Tyr Gly Phe Asn Asn Gln Ser Phe Gly Ser Arg Gly Leu
450 455 460



His Asp Ala Val Tyr Ser Gln Gln Cys Phe Asn Thr Pro Asn Thr Tyr
465 470 475 480



Cys Pro Cys Arg Thr Ser Gln Cys Ile Gly Gly Ala Gly Thr Gly Thr

485 490 495



Cys Pro Val Gly Thr Thr Val Arg Lys Cys Phe Ala Ala Val Thr Lys 

500 505 510

Ala Thr Lys Cys Thr Cys Trp Cys Gln Pro Asp Pro Ser Thr Tyr Lys
515 520 525



Gly Val Asn Ala Trp Thr Cys Pro Gln Ser Lys Val Ser Ile Gln Pro
530 535 540 



Gly Gln His Cys Pro Gly Leu Gly Leu Val Glu Asp Asp Cys Ser Gly
545 550 555 560



Asn Pro Cys Thr Cys Lys Pro Gln Ala Phe Ile Gly Trp Ser Ser Glu

565 570 575



Thr Cys Leu Gln Asn Gly Arg Cys Asn Ile Phe Ala Asn Phe Ile Leu

580 585 590



Asn Asp Val Asn Ser Gly Thr Thr Cys Ser Thr Asp Leu Gln Gln Gly
595 600 605



Asn Thr Ile Ile Thr Thr Asp Val Cys Val Asn Tyr Asp Leu Tyr Gly
610 615 620 



Ile Thr Gly Gln Gly Ile Leu Ile Glu Val Asn Ala Thr Tyr Tyr Asn
625 630 635 640



Ser Trp Gln Asn Leu Leu Tyr Asp Ser Ser Gly Asn Leu Tyr Gly Phe

645 650 655



Arg Asp Tyr Leu Ser Asn Arg Thr Phe Leu He Arg Ser Cys Tyr Ser 

660 665 670



Gly Arg Val Ser Ala Val Phe His Ala Asn Ser Ser Glu Pro Ala Leu 
675 680 685



Met Phe Arg Asn Leu Lys Cys Ser His Val Phe Asn Asn Thr Ile Leu
690 695 700 



Arg Gin He Gin Leu Val Asn Tyr Phe Asp Ser Tyr Leu Gly Cys Val 
705 710 715 720 



Val Asn Ala Tyr Asn Asn Thr Ala Ser Ala Val Ser Thr Cys Asp Leu 

725 730 735 



Thr Val Gly Ser Gly Tyr Cys Val Asp Tyr Val Thr Ala Leu Arg Ser

740 745 750 



Arg Arg Ser Phe Thr Thr Gly Tyr Arg Phe Thr Asn Phe Glu Pro Phe
755 760 765



Ala Ala Asn Leu Val Asn Asp Ser Ile Glu Pro Val Gly Gly Leu Tyr
770 775 780



Glu Ile Gln Ile Pro Ser Glu Phe Thr Ile Gly Asn Leu Glu Glu Phe

785 790 795 800



He Gin Thr Arg Ser Pro Lys Val Thr He Asp Cys Ala Thr Phe Val 

805 810 815 



Cys Gly Asp Tyr Ala Ala Cys Arg Gin Gin Leu Ala Glu Tyr Gly Ser 

820 825 830 



Phe Cys Glu Asn Ile Asn Ala Ile Leu Thr Glu Val Asn Glu Leu Leu
835 840 845 



Asp Thr Thr Gin Leu Gin Val Ala Asn Ser Leu Met Asn Gly Val Thr 
850 855 860 



Leu Ser Thr Lys Ile Lys Asp Gly Ile Asn Phe Asn Val Asp Asp Ile
865 870 875 880

Asn Phe Ser Pro Val Leu Gly Cys Leu Gly Ser Glu Cys Asn Arg Ala

885 890 895



Ser Thr Arg Ser Ala Ile Glu Asp Leu Leu Phe Asp Lys Val Lys Leu

900 905 910



Ser Asp Val Gly Phe Val Gln Ala Tyr Asn Asn Cys Thr Gly Gly Ala
915 920 925



Glu Ile Arg Asp Leu Ile Cys Val Gln Ser Tyr Asn Gly Ile Lys Val
930 935 940



Leu Pro Pro Leu Leu Ser Glu Asn Gln Ile Ser Gly Tyr Thr Leu Ala
945 950 955 960



Ala Thr Ala Ala Ser Leu Phe Pro Pro Trp Thr Ala Ala Ala Gly Val 

965 970 975 



Pro Phe Tyr Leu Asn Val Gin Tyr Arg He Asn Gly Leu Gly Val Thr 

980 985 990 



Met Asp Val Leu Ser Gin Asn Gin Lys Leu He Ala Ser Ala Phe Asn 
995 1000 1005

Asn Ala Leu Asp Ala Ile Gln Glu Gly Phe Asp Ala Thr Asn Ser
1010 1015 1020



Ala Leu Val Lys Ile Gln Ala Val Val Asn Ala Asn Ala Glu Ala
1025 1030 1035

Leu Asn Asn Leu Leu Gln Gln Leu Ser Asn Arg Phe Gly Ala Ile
1040 1045 1050

Ser Ala Ser Leu Gln Glu Ile Leu Ser Arg Leu Asp Ala Leu Glu
1055 1060 1065



Ala Lys Ala Gin He Asp Arg Leu He Asn Gly Arg Leu Thr Ala 
1070 1075 1080 



Leu Asn Ala Tyr Val Ser Gln Gln Leu Ser Asp Ser Thr Leu Val
1085 1090 1095



Lys Phe Ser Ala Ala Gln Ala Ile Glu Lys Val Asn Glu Cys Val
1100 1105 1110



Lys Ser Gln Ser Ser Arg Ile Asn Phe Cys Gly Asn Gly Asn His
1115 1120 1125



Ile Ile Ser Leu Val Gln Asn Ala Pro Tyr Gly Leu Tyr Phe Ile
1130 1135 1140 



His Phe Ser Tyr Val Pro Thr Lys Tyr Val Thr Ala Lys Val Ser
1145 1150 1155 



Pro Gly Leu Cys He Ala Gly Asp He Gly He Ser Pro Lys Ser 
1160 1165 1170 



Gly Tyr Phe He Asn Val Asn Asn Ser Trp Met Phe Thr Gly Ser 
1175 1180 1185 



Ser Tyr Tyr Tyr Pro Glu Pro Ile Thr Gln Asn Asn Val Val Val
1190 1195 1200



Met Ser Thr Cys Ala Val Asn Tyr Thr Lys Ala Pro Asp Leu Met
1205 1210 1215



Leu Asn Thr Ser Thr Pro Asn Leu Pro Asp Phe Lys Glu Glu Leu 
1220 1225 1230

Tyr Gln Trp Phe Lys Asn Gln Ser Ser Val Ala Pro Asp Leu Ser
1235 1240 1245



Leu Asp Tyr Ile Asn Val Thr Phe Leu Asp Leu Gln Asp Glu Met
1250 1255 1260



Asn Arg Leu Gln Glu Ala Ile Lys Val Leu Asn Gln Ser Tyr Ile
1265 1270 1275 



Asn Leu Lys Asp He Gly Thr Tyr Glu Tyr Tyr Val Lys Trp Pro 
1280 1285 1290 



Trp Tyr Val Trp Leu Leu Ile Gly Leu Ala Gly Val Ala Met Leu
1295 1300 1305



Val Leu Leu Phe Phe Ile Cys Cys Cys Thr Gly Cys Gly Thr Ser
1310 1315 1320



Cys Phe Lys Lys Cys Gly Gly Cys Cys Asp Asp Tyr Thr Gly His 
1325 1330 1335



Gln Glu Phe Val Ile Lys Thr Ser His Asp Asp
1340 1345



<210> 61 

<211> 1225 

<212> PRT 

<213> Porcine respiratory coronavirus 

<400> 61

Met Lys Lys Leu Phe Val Val Leu Val Val Met Pro Leu Ile Tyr Gly
1 5 10 15



Asp Lys Phe Pro Thr Ser Val Val Ser Asn Cys Thr Asp Gln Cys Ala

20 25 30



Ser Tyr Val Ala Asn Val Phe Thr Thr Gin Pro Gly Gly Phe He Pro 
35 40 45 



Ser Asp Phe Ser Phe Asn Asn Trp Phe Leu Leu Thr Asn Ser Ser Thr 
50 55 60 



Leu Val Ser Gly Lys Leu Val Thr Lys Gln Pro Leu Leu Val Asn Cys
65 70 75 80



Leu Trp Pro Val Pro Ser Phe Glu Glu Ala Ala Ser Thr Phe Cys Phe

85 90 95



Glu Gly Ala Asp Phe Asp Gin Cys Asn Gly Ala Val Leu Asn Asn Thr 

100 105 110



Val Asp Val Ile Arg Phe Asn Leu Asn Phe Thr Thr Asn Val Gln Ser
115 120 125

Gly Lys Gly Ala Thr Val Phe Ser Leu Asn Thr Thr Gly Gly Val Thr 
130 135 140



Leu Glu lie Ser Cys Tyr Asn Asp Thr Val Ser Asp Ser Ser Phe Ser 
145 150 155 160 



Ser Tyr Gly Glu He Pro Phe Gly Val Thr Asn Gly Pro Arg Tyr Cys 

165 170 175 



Tyr Val Leu Tyr Asn Gly Thr Ala Leu Lys Tyr Leu Gly Thr Leu Pro

180 185 190 



Pro Ser Val Lys Glu He Ala He Ser Lys Trp Gly His Phe Tyr He 
195 200 205 


Asn Gly Tyr Asn Phe Phe Ser Thr Phe Pro Ile Asp Cys Ile Ser Phe
210 215 220



Asn Leu Thr Thr Gly Asp Ser Asp Val Phe Trp Thr Ile Ala Tyr Thr
225 230 235 240



Ser Tyr Thr Glu Ala Leu Val Gln Val Glu Asn Thr Ala Ile Thr Asn

245 250 255 



Val Thr Tyr Cys Asn Ser Tyr Val Asn Asn Ile Lys Cys Ser Gln Leu

260 265 270 



Thr Ala Asn Leu Asn Asn Gly Phe Tyr Pro Val Ser Ser Ser Glu Val
275 280 285 



Gly Ser Val Asn Lys Ser Val Val Leu Leu Pro Ser Phe Leu Thr His 
290 295 300



Thr Ile Val Asn Ile Thr Ile Gly Leu Gly Met Lys Arg Ser Gly Tyr
305 310 315 320 



Gly Gin Pro He Ala Ser Thr Leu Ser Asn He Thr Leu Pro Met Gin 

325 330 335

Asp Asn Asn Thr Asp Val Tyr Cys Val Arg Ser Asp Gln Phe Ser Val

340 345 350



Tyr Val His Ser Thr Cys Lys Ser Ala Leu Trp Asp Asn Val Phe Lys
355 360 365 



Arg Asn Cys Thr Asp Val Leu Asp Ala Thr Ala Val Ile Lys Thr Gly
370 375 380



Thr Cys Pro Phe Ser Phe Asp Lys Leu Asn Asn Tyr Leu Thr Phe Asn
385 390 395 400



Lys Phe Cys Leu Ser Leu Ser Pro Val Gly Ala Asn Cys Lys Phe Asp

405 410 415



Val Ala Ala Arg Thr Arg Thr Asn Glu Gln Val Val Arg Ser Leu Tyr

420 425 430



Val Ile Tyr Glu Glu Gly Asp Ser Ile Val Gly Val Pro Ser Asp Asn
435 440 445



Ser Gly Leu His Asp Leu Ser Val Leu His Leu Asp Ser Cys Thr Asp
450 455 460



Tyr Asn Ile Tyr Gly Arg Thr Gly Val Gly Ile Ile Arg Gln Thr Asn
465 470 475 480



Arg Thr Leu Leu Ser Gly Leu Tyr Tyr Thr Ser Leu Ser Gly Asp Leu

485 490 495



Leu Gly Phe Lys Asn Val Ser Asp Gly Val Ile Tyr Ser Val Thr Pro

500 505 510



Cys Asp Val Ser Ala Gln Ala Ala Val Ile Asp Gly Thr Ile Val Gly
515 520 525



Ala He Thr Ser He Asn Ser Glu Leu Leu Gly Leu Thr His Trp Thr 
530 535 540 



He Thr Pro Asn Phe Tyr Tyr Tyr Ser He Tyr Asn Tyr Thr Asn Asp 
545 550 555 560 



Lys Thr Arg Gly Thr Pro He Asp Ser Asn Asp Val Gly Cys Glu Pro 

565 570 575

Val Ile Thr Tyr Ser Asn Ile Gly Val Cys Lys Asn Gly Ala Leu Val

580 585 590



Phe Ile Asn Val Thr His Ser Asp Gly Asp Val Gln Pro Ile Ser Thr
595 600 605

Gly Asn Val Thr He Pro Thr Asn Phe Thr He Ser Val Gin Val Glu 
610 615 620 



Tyr Ile Gln Val Tyr Thr Thr Pro Val Ser Ile Asp Cys Ser Arg Tyr
625 630 635 640



Val Cys Asn Gly Asn Pro Arg Cys Asn Lys Leu Leu Thr Gln Tyr Val

645 650 655 



Ser Ala Cys Gin Thr He Glu Gin Ala Leu Ala Met Gly Ala Arg Leu 

660 665 670



Glu Asn Met Glu Val Asp Ser Met Leu Phe Val Ser Glu Asn Ala Leu 
675 680 685 



Lys Leu Ala Ser Val Glu Ala Phe Asn Ser Ser Glu Thr Leu Asp Pro 
690 695 700



Ile Tyr Thr Gln Trp Pro Asn Ile Gly Gly Phe Trp Leu Glu Gly Leu
705 710 715 720



Lys Tyr Ile Leu Pro Ser Asp Asn Ser Lys Arg Lys Tyr Arg Ser Ala

725 730 735 



He Glu Asp Leu Leu Phe Ser Lys Val Val Thr Ser Gly Leu Gly Thr 

740 745 750 



Val Asp Glu Asp Tyr Lys Arg Cys Thr Gly Gly Tyr Asp He Ala Asp 
755 760 765



Leu Val Cys Ala Gin Tyr Tyr Asn Gly He Met Val Leu Pro Gly Val 
770 775 780 



Ala Asn Ala Asp Lys Met Thr Met Tyr Thr Ala Ser Leu Ala Gly Gly 
785 790 795 800 



He Thr Leu Gly Ala Phe Gly Gly Gly Ala Val Ser He Pro Phe Ala 

805 810 815

Val Ala Val Gln Ala Arg Leu Asn Tyr Val Ala Leu Gln Thr Asp Val

820 825 830



Leu Asn Lys Asn Gln Gln Ile Leu Ala Ser Ala Phe Asn Gln Ala Ile
835 840 845 



Gly Asn Ile Thr Gln Ser Phe Gly Lys Val Asn Asp Ala Ile His Gln
850 855 860



Thr Ser Arg Gly Leu Thr Thr Val Ala Lys Ala Leu Ala Lys Val Gln
865 870 875 880



Asp Val Val Asn Thr Gln Gly Gln Ala Leu Arg His Leu Thr Val Gln

885 890 895



Leu Gin Asn Asn Phe Gin Ala lie Ser Ser Ser Ile Ser Asp lie Tyr 

900 905 910 



Asn Arg Leu Asp Glu Leu Ser Ala Asp Ala Gin Val Asp Arg Leu lie 
915 920 925 



Thr Gly Arg Leu Thr Ala Leu Asn Ala Phe Val Ser Gln Thr Leu Thr
930 935 940

Arg Gin Ala Glu Val Arg Ala Ser Arg Gin Leu Ala Lys Asp Lys Val 
945 950 955 960 



Asn Glu Cys Val Arg Ser Gln Ser Gln Arg Phe Gly Phe Cys Gly Asn

965 970 975



Gly Thr His Leu Phe Ser Leu Ala Asn Ala Ala Pro Asn Gly Met Ile

980 985 990



Phe Phe His Thr Val Leu Leu Pro Thr Ala Tyr Glu Thr Val Thr Ala
995 1000 1005 



Trp Ser Gly Ile Cys Ala Leu Asp Gly Asp Arg Thr Phe Gly Leu
1010 1015 1020 



Val Val Lys Asp Val Gin Leu Thr Leu Phe Arg Asn Leu Asp Asp 
1025 1030 1035



Lys Phe Tyr Leu Thr Pro Arg Thr Met Tyr Gin Pro Arg Val Ala 
1040 1045 1050 



Thr Ser Ser Asp Phe Val Gln Ile Glu Gly Cys Asp Val Leu Phe
1055 1060 1065



Val Asn Thr Thr Val Ser Asp Leu Pro Ser Ile Ile Pro Asp Tyr
1070 1075 1080



Ile Asp Ile Asn Gln Thr Val Gln Asp Ile Leu Glu Asn Phe Arg
1085 1090 1095

Pro Asn Trp Thr Val Pro Glu Leu Thr Leu Asp Val Phe Asn Ala
1100 1105 1110



Thr Tyr Leu Asn Leu Thr Gly Glu Ile Asp Asp Leu Glu Phe Arg
1115 1120 1125



Ser Glu Lys Leu His Asn Thr Thr Val Glu Leu Ala Ile Leu Ile
1130 1135 1140 



Asp Asn He Asn Asn Thr Leu Val Asn Leu Glu Trp Leu Asn Arg 
1145 1150 1155 



Ile Glu Thr Tyr Val Lys Trp Pro Trp Tyr Val Trp Leu Leu Ile
1160 1165 1170 

Gly Leu Val Val Ile Phe Cys Ile Pro Leu Leu Leu Phe Cys Cys
1175 1180 1185



Cys Ser Thr Gly Cys Cys Gly Cys He Gly Cys Leu Gly Ser Cys 
1190 1195 1200



Cys His Ser He Phe Ser Arg Arg Gin Phe Glu Asn Tyr Glu Pro 
1205 1210 1215



He Glu Lys Val His Val His 
1220 1225 



<210> 62
<211> 82 
<212> PRT 

<213> Porcine transmissible gastroenteritis coronavirus
<400> 62 

Met Thr Phe Pro Arg Ala Leu Thr Val Ile Asp Asp Asn Gly Met Val
1 5 10 15



Ile Asn Ile Ile Phe Trp Phe Leu Leu Ile Ile Ile Leu Ile Leu Leu

20 25 30

Ser Ile Ala Leu Leu Asn Ile Ile Lys Leu Cys Met Val Cys Cys Asn

35 40 45



Leu Gly Arg Thr Val Ile Ile Val Pro Ala Gln His Ala Tyr Asp Ala
50 55 60

Tyr Lys Asn Phe Met Arg Ile Lys Ala Tyr Asn Pro Asp Gly Ala Leu
65 70 75 80 



Leu Ala 



<210> 63

<211> 4376
<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 63

Met Glu Ser Leu Val Leu Gly Val Asn Glu Lys Thr His Val Gln Leu
1 5 10 15



Ser Leu Pro Val Leu Gln Val Arg Asp Val Leu Val Arg Gly Phe Gly
20 25 30



Asp Ser Val Glu Glu Ala Leu Ser Glu Ala Arg Glu His Leu Lys Asn 
35 40 45



Gly Thr Cys Gly Leu Val Glu Leu Glu Lys Gly Val Leu Pro Gln Leu
50 55 60



Glu Gin Pro Tyr Val Phe He Lys Arg Ser Asp Ala Leu Ser Thr Asn 
65 70 75 80 



His Gly His Lys Val Val Glu Leu Val Ala Glu Met Asp Gly Ile Gln

85 90 95



Tyr Gly Arg Ser Gly He Thr Leu Gly Val Leu Val Pro His Val Gly 

100 105 110 



Glu Thr Pro He Ala Tyr Arg Asn Val Leu Leu Arg Lys Asn Gly Asn 
115 120 125 



Lys Gly Ala Gly Gly His Ser Tyr Gly He Asp Leu Lys Ser Tyr Asp 
130 135 140 



Leu Gly Asp Glu Leu Gly Thr Asp Pro Ile Glu Asp Tyr Glu Gln Asn
145 150 155 160



Trp Asn Thr Lys His Gly Ser Gly Ala Leu Arg Glu Leu Thr Arg Glu

165 170 175 



Leu Asn Gly Gly Ala Val Thr Arg Tyr Val Asp Asn Asn Phe Cys Gly

180 185 190



Pro Asp Gly Tyr Pro Leu Asp Cys lie Lys Asp Phe Leu Ala Arg Ala 
195 200 205 



Gly Lys Ser Met Cys Thr Leu Ser Glu Gin Leu Asp Tyr lie Glu Ser 
210 215 220 



Lys Arg Gly Val Tyr Cys Cys Arg Asp His Glu His Glu lie Ala Trp 
225 230 235 240 



Phe Thr Glu Arg Ser Asp Lys Ser Tyr Glu His Gln Thr Pro Phe Glu

245 250 255 



Ile Lys Ser Ala Lys Lys Phe Asp Thr Phe Lys Gly Glu Cys Pro Lys

260 265 270



Phe Val Phe Pro Leu Asn Ser Lys Val Lys Val He Gin Pro Arg Val 
275 280 285 



Glu Lys Lys Lys Thr Glu Gly Phe Met Gly Arg He Arg Ser Val Tyr 
290 295 300 



Pro Val Ala Ser Pro Gin Glu Cys Asn Asn Met His Leu Ser Thr Leu 
305 310 315 320



Met Lys Cys Asn His Cys Asp Glu Val Ser Trp Gin Thr Cys Asp Phe 

325 330 335 



Leu Lys Ala Thr Cys Glu His Cys Gly Thr Glu Asn Leu Val Ile Glu

340 345 350 



Gly Pro Thr Thr Cys Gly Tyr Leu Pro Thr Asn Ala Val Val Lys Met 
355 360 365 



Pro Cys Pro Ala Cys Gin Asp Pro Glu He Gly Pro Glu His Ser Val 
370 375 380 

Ala Asp Tyr His Asn His Ser Asn He Glu Thr Arg Leu Arg Lys Gly 
385 390 395 400

Gly Arg Thr Arg Cys Phe Gly Gly Cys Val Phe Ala Tyr Val Gly Cys

405 410 415



Tyr Asn Lys Arg Ala Tyr Trp Val Pro Arg Ala Ser Ala Asp Ile Gly

420 425 430



Ser Gly His Thr Gly Ile Thr Gly Asp Asn Val Glu Thr Leu Asn Glu
435 440 445



Asp Leu Leu Glu He Leu Ser Arg Glu Arg Val Asn He Asn He Val 
450 455 460 



Gly Asp Phe His Leu Asn Glu Glu Val Ala Ile Ile Leu Ala Ser Phe
465 470 475 480



Ser Ala Ser Thr Ser Ala Phe He Asp Thr He Lys Ser Leu Asp Tyr 

485 490 495 



Lys Ser Phe Lys Thr He Val Glu Ser Cys Gly Asn Tyr Lys Val Thr 

500 505 510



Lys Gly Lys Pro Val Lys Gly Ala Trp Asn Ile Gly Gln Gln Arg Ser
515 520 525



Val Leu Thr Pro Leu Cys Gly Phe Pro Ser Gln Ala Ala Gly Val Ile
530 535 540 



Arg Ser Ile Phe Ala Arg Thr Leu Asp Ala Ala Asn His Ser Ile Pro
545 550 555 560 



Asp Leu Gln Arg Ala Ala Val Thr Ile Leu Asp Gly Ile Ser Glu Gln

565 570 575



Ser Leu Arg Leu Val Asp Ala Met Val Tyr Thr Ser Asp Leu Leu Thr

580 585 590



Asn Ser Val Ile Ile Met Ala Tyr Val Thr Gly Gly Leu Val Gln Gln
595 600 605 



Thr Ser Gin Trp Leu Ser Asn Leu Leu Gly Thr Thr Val Glu Lys Leu 
610 615 620



Arg Pro He Phe Glu Trp He Glu Ala Lys Leu Ser Ala Gly Val Glu 
625 630 635 640

Phe Leu Lys Asp Ala Trp Glu Ile Leu Lys Phe Leu Ile Thr Gly Val

645 650 655



Phe Asp Ile Val Lys Gly Gln Ile Gln Val Ala Ser Asp Asn Ile Lys
660 665 670

Asp Cys Val Lys Cys Phe Ile Asp Val Val Asn Lys Ala Leu Glu Met
675 680 685



Cys Ile Asp Gln Val Thr Ile Ala Gly Ala Lys Leu Arg Ser Leu Asn
690 695 700



Leu Gly Glu Val Phe Ile Ala Gln Ser Lys Gly Leu Tyr Arg Gln Cys
705 710 715 720



Ile Arg Gly Lys Glu Gln Leu Gln Leu Leu Met Pro Leu Lys Ala Pro

725 730 735



Lys Glu Val Thr Phe Leu Glu Gly Asp Ser His Asp Thr Val Leu Thr 

740 745 750



Ser Glu Glu Val Val Leu Lys Asn Gly Glu Leu Glu Ala Leu Glu Thr 
755 760 765



Pro Val Asp Ser Phe Thr Asn Gly Ala Ile Val Gly Thr Pro Val Cys
770 775 780



Val Asn Gly Leu Met Leu Leu Glu Ile Lys Asp Lys Glu Gln Tyr Cys
785 790 795 800



Ala Leu Ser Pro Gly Leu Leu Ala Thr Asn Asn Val Phe Arg Leu Lys 

805 810 815 



Gly Gly Ala Pro He Lys Gly Val Thr Phe Gly Glu Asp Thr Val Trp 

820 825 830 



Glu Val Gin Gly Tyr Lys Asn Val Arg He Thr Phe Glu Leu Asp Glu 
835 840 845



Arg Val Asp Lys Val Leu Asn Glu Lys Cys Ser Val Tyr Thr Val Glu 
850 855 860 



Ser Gly Thr Glu Val Thr Glu Phe Ala Cys Val Val Ala Glu Ala Val 
865 870 875 880

Val Lys Thr Leu Gln Pro Val Ser Asp Leu Leu Thr Asn Met Gly Ile

885 890 895



Asp Leu Asp Glu Trp Ser Val Ala Thr Phe Tyr Leu Phe Asp Asp Ala

900 905 910

Gly Glu Glu Asn Phe Ser Ser Arg Met Tyr Cys Ser Phe Tyr Pro Pro 

915 920 925



Asp Glu Glu Glu Glu Asp Asp Ala Glu Cys Glu Glu Glu Glu He Asp 
930 935 940 



Glu Thr Cys Glu His Glu Tyr Gly Thr Glu Asp Asp Tyr Gln Gly Leu
945 950 955 960



Pro Leu Glu Phe Gly Ala Ser Ala Glu Thr Val Arg Val Glu Glu Glu

965 970 975



Glu Glu Glu Asp Trp Leu Asp Asp Thr Thr Glu Gln Ser Glu Ile Glu

980 985 990



Pro Glu Pro Glu Pro Thr Pro Glu Glu Pro Val Asn Gln Phe Thr Gly
995 1000 1005 



Tyr Leu Lys Leu Thr Asp Asn Val Ala He Lys Cys Val Asp He 
1010 1015 1020



Val Lys Glu Ala Gin Ser Ala Asn Pro Met Val He Val Asn Ala 
1025 1030 1035 



Ala Asn He His Leu Lys His Gly Gly Gly Val Ala Gly Ala Leu 
1040 1045 1050 



Asn Lys Ala Thr Asn Gly Ala Met Gin Lys Glu Ser Asp Asp Tyr 
1055 1060 1065 



He Lys Leu Asn Gly Pro Leu Thr Val Gly Gly Ser Cys Leu Leu 
1070 1075 1080



Ser Gly His Asn Leu Ala Lys Lys Cys Leu His Val Val Gly Pro 
1085 1090 1095 



Asn Leu Asn Ala Gly Glu Asp He Gin Leu Leu Lys Ala Ala Tyr 
1100 1105 1110 



Glu Asn Phe Asn Ser Gln Asp Ile Leu Leu Ala Pro Leu Leu Ser
1115 1120 1125



Ala Gly Ile Phe Gly Ala Lys Pro Leu Gln Ser Leu Gln Val Cys
1130 1135 1140 



Val Gln Thr Val Arg Thr Gln Val Tyr Ile Ala Val Asn Asp Lys
1145 1150 1155

Ala Leu Tyr Glu Gln Val Val Met Asp Tyr Leu Asp Asn Leu Lys
1160 1165 1170

Pro Arg Val Glu Ala Pro Lys Gln Glu Glu Pro Pro Asn Thr Glu
1175 1180 1185



Asp Ser Lys Thr Glu Glu Lys Ser Val Val Gln Lys Pro Val Asp
1190 1195 1200



Val Lys Pro Lys Ile Lys Ala Cys Ile Asp Glu Val Thr Thr Thr
1205 1210 1215



Leu Glu Glu Thr Lys Phe Leu Thr Asn Lys Leu Leu Leu Phe Ala
1220 1225 1230



Asp Ile Asn Gly Lys Leu Tyr His Asp Ser Gln Asn Met Leu Arg
1235 1240 1245



Gly Glu Asp Met Ser Phe Leu Glu Lys Asp Ala Pro Tyr Met Val 
1250 1255 1260 



Gly Asp Val Ile Thr Ser Gly Asp Ile Thr Cys Val Val Ile Pro
1265 1270 1275



Ser Lys Lys Ala Gly Gly Thr Thr Glu Met Leu Ser Arg Ala Leu
1280 1285 1290



Lys Lys Val Pro Val Asp Glu Tyr Ile Thr Thr Tyr Pro Gly Gln
1295 1300 1305 



Gly Cys Ala Gly Tyr Thr Leu Glu Glu Ala Lys Thr Ala Leu Lys
1310 1315 1320 



Lys Cys Lys Ser Ala Phe Tyr Val Leu Pro Ser Glu Ala Pro Asn
1325 1330 1335 



Ala Lys Glu Glu Ile Leu Gly Thr Val Ser Trp Asn Leu Arg Glu
1340 1345 1350

Met Leu Ala His Ala Glu Glu Thr Arg Lys Leu Met Pro Ile Cys
1355 1360 1365



Met Asp Val Arg Ala lie Met Ala Thr lie Gin Arg Lys Tyr Lys 
1370 1375 1380 



Gly Ile Lys Ile Gln Glu Gly Ile Val Asp Tyr Gly Val Arg Phe
1385 1390 1395 



Phe Phe Tyr Thr Ser Lys Glu Pro Val Ala Ser Ile Ile Thr Lys
1400 1405 1410



Leu Asn Ser Leu Asn Glu Pro Leu Val Thr Met Pro He Gly Tyr 
1415 1420 1425



Val Thr His Gly Phe Asn Leu Glu Glu Ala Ala Arg Cys Met Arg
1430 1435 1440



Ser Leu Lys Ala Pro Ala Val Val Ser Val Ser Ser Pro Asp Ala 
1445 1450 1455 



Val Thr Thr Tyr Asn Gly Tyr Leu Thr Ser Ser Ser Lys Thr Ser
1460 1465 1470



Glu Glu His Phe Val Glu Thr Val Ser Leu Ala Gly Ser Tyr Arg 
1475 1480 1485



Asp Trp Ser Tyr Ser Gly Gln Arg Thr Glu Leu Gly Val Glu Phe
1490 1495 1500 



Leu Lys Arg Gly Asp Lys Ile Val Tyr His Thr Leu Glu Ser Pro
1505 1510 1515 



Val Glu Phe His Leu Asp Gly Glu Val Leu Ser Leu Asp Lys Leu 
1520 1525 1530



Lys Ser Leu Leu Ser Leu Arg Glu Val Lys Thr Ile Lys Val Phe
1535 1540 1545



Thr Thr Val Asp Asn Thr Asn Leu His Thr Gin Leu Val Asp Met 
1550 1555 1560 






Ser Met Thr Tyr Gly Gin Gin Phe Gly Pro Thr Tyr Leu Asp Gly 
1565 1570 1575

Ala Asp Val Thr Lys Ile Lys Pro His Val Asn His Gln Gly Lys
1580 1585 1590



Thr Phe Phe Val Leu Pro Ser Asp Asp Thr Leu Arg Ser Glu Ala

1595 1600 1605

Phe Glu Tyr Tyr His Thr Leu Asp Glu Ser Phe Leu Gly Arg Tyr
1610 1615 1620 



Met Ser Ala Leu Asn His Thr Lys Lys Trp Lys Phe Pro Gln Val
1625 1630 1635 



Gly Gly Leu Thr Ser Ile Lys Trp Ala Asp Asn Asn Cys Tyr Leu
1640 1645 1650



Ser Ser Val Leu Leu Ala Leu Gln Gln Leu Glu Val Lys Phe Asn
1655 1660 1665 



Ala Pro Ala Leu Gln Glu Ala Tyr Tyr Arg Ala Arg Ala Gly Asp
1670 1675 1680



Ala Ala Asn Phe Cys Ala Leu Ile Leu Ala Tyr Ser Asn Lys Thr
1685 1690 1695



Val Gly Glu Leu Gly Asp Val Arg Glu Thr Met Thr His Leu Leu
1700 1705 1710



Gln His Ala Asn Leu Glu Ser Ala Lys Arg Val Leu Asn Val Val
1715 1720 1725 



Cys Lys His Cys Gly Gln Lys Thr Thr Thr Leu Thr Gly Val Glu
1730 1735 1740 



Ala Val Met Tyr Met Gly Thr Leu Ser Tyr Asp Asn Leu Lys Thr 
1745 1750 1755 



Gly Val Ser Ile Pro Cys Val Cys Gly Arg Asp Ala Thr Gln Tyr
1760 1765 1770 



Leu Val Gln Gln Glu Ser Ser Phe Val Met Met Ser Ala Pro Pro
1775 1780 1785 



Ala Glu Tyr Lys Leu Gln Gln Gly Thr Phe Leu Cys Ala Asn Glu
1790 1795 1800




Tyr Thr Gly Asn Tyr Gln Cys Gly His Tyr Thr His Ile Thr Ala
1805 1810 1815



Lys Glu Thr Leu Tyr Arg Ile Asp Gly Ala His Leu Thr Lys Met
1820 1825 1830

Ser Glu Tyr Lys Gly Pro Val Thr Asp Val Phe Tyr Lys Glu Thr
1835 1840 1845



Ser Tyr Thr Thr Thr Ile Lys Pro Val Ser Tyr Lys Leu Asp Gly
1850 1855 1860 



Val Thr Tyr Thr Glu Ile Glu Pro Lys Leu Asp Gly Tyr Tyr Lys
1865 1870 1875



Lys Asp Asn Ala Tyr Tyr Thr Glu Gln Pro Ile Asp Leu Val Pro
1880 1885 1890



Thr Gln Pro Leu Pro Asn Ala Ser Phe Asp Asn Phe Lys Leu Thr
1895 1900 1905 



Cys Ser Asn Thr Lys Phe Ala Asp Asp Leu Asn Gln Met Thr Gly

1910 1915 1920 


Phe Thr Lys Pro Ala Ser Arg Glu Leu Ser Val Thr Phe Phe Pro
1925 1930 1935



Asp Leu Asn Gly Asp Val Val Ala Ile Asp Tyr Arg His Tyr Ser
1940 1945 1950 



Ala Ser Phe Lys Lys Gly Ala Lys Leu Leu His Lys Pro Ile Val
1955 1960 1965



Trp His Ile Asn Gln Ala Thr Thr Lys Thr Thr Phe Lys Pro Asn
1970 1975 1980 



Thr Trp Cys Leu Arg Cys Leu Trp Ser Thr Lys Pro Val Asp Thr 
1985 1990 1995 



Ser Asn Ser Phe Glu Val Leu Ala Val Glu Asp Thr Gln Gly Met
2000 2005 2010 



Asp Asn Leu Ala Cys Glu Ser Gln Gln Pro Thr Ser Glu Glu Val
2015 2020 2025 



Val Glu Asn Pro Thr Ile Gln Lys Glu Val Ile Glu Cys Asp Val
2030 2035 2040



Lys Thr Thr Glu Val Val Gly Asn Val Ile Leu Lys Pro Ser Asp
2045 2050 2055


Glu Gly Val Lys Val Thr Gln Glu Leu Gly His Glu Asp Leu Met
2060 2065 2070

Ala Ala Tyr Val Glu Asn Thr Ser Ile Thr Ile Lys Lys Pro Asn
2075 2080 2085



Glu Leu Ser Leu Ala Leu Gly Leu Lys Thr Ile Ala Thr His Gly
2090 2095 2100



Ile Ala Ala Ile Asn Ser Val Pro Trp Ser Lys Ile Leu Ala Tyr
2105 2110 2115



Val Lys Pro Phe Leu Gly Gln Ala Ala Ile Thr Thr Ser Asn Cys
2120 2125 2130 



Ala Lys Arg Leu Ala Gln Arg Val Phe Asn Asn Tyr Met Pro Tyr
2135 2140 2145



Val Phe Thr Leu Leu Phe Gln Leu Cys Thr Phe Thr Lys Ser Thr
2150 2155 2160 


Asn Ser Arg Ile Arg Ala Ser Leu Pro Thr Thr Ile Ala Lys Asn
2165 2170 2175 



Ser Val Lys Ser Val Ala Lys Leu Cys Leu Asp Ala Gly Ile Asn
2180 2185 2190



Tyr Val Lys Ser Pro Lys Phe Ser Lys Leu Phe Thr Ile Ala Met
2195 2200 2205 



Trp Leu Leu Leu Leu Ser Ile Cys Leu Gly Ser Leu Ile Cys Val
2210 2215 2220



Thr Ala Ala Phe Gly Val Leu Leu Ser Asn Phe Gly Ala Pro Ser 
2225 2230 2235 



Tyr Cys Asn Gly Val Arg Glu Leu Tyr Leu Asn Ser Ser Asn Val 
2240 2245 2250 



Thr Thr Met Asp Phe Cys Glu Gly Ser Phe Pro Cys Ser Ile Cys
2255 2260 2265 
Leu Ser Gly Leu Asp Ser Leu Asp Ser Tyr Pro Ala Leu Glu Thr
2270 2275 2280



Ile Gln Val Thr Ile Ser Ser Tyr Lys Leu Asp Leu Thr Ile Leu
2285 2290 2295



Gly Leu Ala Ala Glu Trp Val Leu Ala Tyr Met Leu Phe Thr Lys 
2300 2305 2310



Phe Phe Tyr Leu Leu Gly Leu Ser Ala Ile Met Gln Val Phe Phe
2315 2320 2325



Gly Tyr Phe Ala Ser His Phe Ile Ser Asn Ser Trp Leu Met Trp
2330 2335 2340



Phe Ile Ile Ser Ile Val Gln Met Ala Pro Val Ser Ala Met Val
2345 2350 2355 



Arg Met Tyr Ile Phe Phe Ala Ser Phe Tyr Tyr Ile Trp Lys Ser
2360 2365 2370 



Tyr Val His Ile Met Asp Gly Cys Thr Ser Ser Thr Cys Met Met
2375 2380 2385



Cys Tyr Lys Arg Asn Arg Ala Thr Arg Val Glu Cys Thr Thr Ile
2390 2395 2400



Val Asn Gly Met Lys Arg Ser Phe Tyr Val Tyr Ala Asn Gly Gly 
2405 2410 2415 



Arg Gly Phe Cys Lys Thr His Asn Trp Asn Cys Leu Asn Cys Asp
2420 2425 2430






Thr Phe Cys Thr Gly Ser Thr Phe Ile Ser Asp Glu Val Ala Arg
2435 2440 2445 



Asp Leu Ser Leu Gln Phe Lys Arg Pro Ile Asn Pro Thr Asp Gln
2450 2455 2460 



Ser Ser Tyr Ile Val Asp Ser Val Ala Val Lys Asn Gly Ala Leu
2465 2470 2475 



His Leu Tyr Phe Asp Lys Ala Gly Gln Lys Thr Tyr Glu Arg His
2480 2485 2490
Pro Leu Ser His Phe Val Asn Leu Asp Asn Leu Arg Ala Asn Asn
2495 2500 2505 



Thr Lys Gly Ser Leu Pro Ile Asn Val Ile Val Phe Asp Gly Lys
2510 2515 2520 



Ser Lys Cys Asp Glu Ser Ala Ser Lys Ser Ala Ser Val Tyr Tyr
2525 2530 2535 






Ser Gln Leu Met Cys Gln Pro Ile Leu Leu Leu Asp Gln Ala Leu
2540 2545 2550 



Val Ser Asp Val Gly Asp Ser Thr Glu Val Ser Val Lys Met Phe
2555 2560 2565 



Asp Ala Tyr Val Asp Thr Phe Ser Ala Thr Phe Ser Val Pro Met
2570 2575 2580 



Glu Lys Leu Lys Ala Leu Val Ala Thr Ala His Ser Glu Leu Ala 
2585 2590 2595



Lys Gly Val Ala Leu Asp Gly Val Leu Ser Thr Phe Val Ser Ala
2600 2605 2610


Ala Arg Gln Gly Val Val Asp Thr Asp Val Asp Thr Lys Asp Val
2615 2620 2625



Ile Glu Cys Leu Lys Leu Ser His His Ser Asp Leu Glu Val Thr
2630 2635 2640 



Gly Asp Ser Cys Asn Asn Phe Met Leu Thr Tyr Asn Lys Val Glu 
2645 2650 2655



Asn Met Thr Pro Arg Asp Leu Gly Ala Cys Ile Asp Cys Asn Ala
2660 2665 2670 



Arg His Ile Asn Ala Gln Val Ala Lys Ser His Asn Val Ser Leu
2675 2680 2685 



Ile Trp Asn Val Lys Asp Tyr Met Ser Leu Ser Glu Gln Leu Arg
2690 2695 2700 



Lys Gln Ile Arg Ser Ala Ala Lys Lys Asn Asn Ile Pro Phe Arg
2705 2710 2715 



Leu Thr Cys Ala Thr Thr Arg Gln Val Val Asn Val Ile Thr Thr
2720 2725 2730



Lys Ile Ser Leu Lys Gly Gly Lys Ile Val Ser Thr Cys Phe Lys
2735 2740 2745



Leu Met Leu Lys Ala Thr Leu Leu Cys Val Leu Ala Ala Leu Val
2750 2755 2760



Cys Tyr Ile Val Met Pro Val His Thr Leu Ser Ile His Asp Gly
2765 2770 2775 



Tyr Thr Asn Glu Ile Ile Gly Tyr Lys Ala Ile Gln Asp Gly Val
2780 2785 2790



Thr Arg Asp Ile Ile Ser Thr Asp Asp Cys Phe Ala Asn Lys His
2795 2800 2805



Ala Gly Phe Asp Ala Trp Phe Ser Gln Arg Gly Gly Ser Tyr Lys
2810 2815 2820



Asn Asp Lys Ser Cys Pro Val Val Ala Ala Ile Ile Thr Arg Glu
2825 2830 2835 

Ile Gly Phe Ile Val Pro Gly Leu Pro Gly Thr Val Leu Arg Ala
2840 2845 2850 



Ile Asn Gly Asp Phe Leu His Phe Leu Pro Arg Val Phe Ser Ala
2855 2860 2865 



Val Gly Asn Ile Cys Tyr Thr Pro Ser Lys Leu Ile Glu Tyr Ser
2870 2875 2880 



Asp Phe Ala Thr Ser Ala Cys Val Leu Ala Ala Glu Cys Thr Ile
2885 2890 2895 



Phe Lys Asp Ala Met Gly Lys Pro Val Pro Tyr Cys Tyr Asp Thr
2900 2905 2910 



Asn Leu Leu Glu Gly Ser Ile Ser Tyr Ser Glu Leu Arg Pro Asp
2915 2920 2925 



Thr Arg Tyr Val Leu Met Asp Gly Ser Ile Ile Gln Phe Pro Asn
2930 2935 2940 



Thr Tyr Leu Glu Gly Ser Val Arg Val Val Thr Thr Phe Asp Ala 
2945 2950 2955



Glu Tyr Cys Arg His Gly Thr Cys Glu Arg Ser Glu Val Gly Ile
2960 2965 2970 



Cys Leu Ser Thr Ser Gly Arg Trp Val Leu Asn Asn Glu His Tyr
2975 2980 2985


Arg Ala Leu Ser Gly Val Phe Cys Gly Val Asp Ala Met Asn Leu
2990 2995 3000



Ile Ala Asn Ile Phe Thr Pro Leu Val Gln Pro Val Gly Ala Leu
3005 3010 3015 



Asp Val Ser Ala Ser Val Val Ala Gly Gly Ile Ile Ala Ile Leu
3020 3025 3030



Val Thr Cys Ala Ala Tyr Tyr Phe Met Lys Phe Arg Arg Val Phe
3035 3040 3045



Gly Glu Tyr Asn His Val Val Ala Ala Asn Ala Leu Leu Phe Leu 
3050 3055 3060 

Met Ser Phe Thr Ile Leu Cys Leu Val Pro Ala Tyr Ser Phe Leu
3065 3070 3075



Pro Gly Val Tyr Ser Val Phe Tyr Leu Tyr Leu Thr Phe Tyr Phe 
3080 3085 3090



Thr Asn Asp Val Ser Phe Leu Ala His Leu Gln Trp Phe Ala Met
3095 3100 3105



Phe Ser Pro Ile Val Pro Phe Trp Ile Thr Ala Ile Tyr Val Phe
3110 3115 3120



Cys Ile Ser Leu Lys His Cys His Trp Phe Phe Asn Asn Tyr Leu
3125 3130 3135 



Arg Lys Arg Val Met Phe Asn Gly Val Thr Phe Ser Thr Phe Glu 
3140 3145 3150 



Glu Ala Ala Leu Cys Thr Phe Leu Leu Asn Lys Glu Met Tyr Leu 
3155 3160 3165 



Lys Leu Arg Ser Glu Thr Leu Leu Pro Leu Thr Gln Tyr Asn Arg
3170 3175 3180 
Tyr Leu Ala Leu Tyr Asn Lys Tyr Lys Tyr Phe Ser Gly Ala Leu
3185 3190 3195



Asp Thr Thr Ser Tyr Arg Glu Ala Ala Cys Cys His Leu Ala Lys
3200 3205 3210



Ala Leu Asn Asp Phe Ser Asn Ser Gly Ala Asp Val Leu Tyr Gln
3215 3220 3225 



Pro Pro Gln Thr Ser Ile Thr Ser Ala Val Leu Gln Ser Gly Phe
3230 3235 3240 



Arg Lys Met Ala Phe Pro Ser Gly Lys Val Glu Gly Cys Met Val 
3245 3250 3255 



Gln Val Thr Cys Gly Thr Thr Thr Leu Asn Gly Leu Trp Leu Asp
3260 3265 3270 



Asp Thr Val Tyr Cys Pro Arg His Val Ile Cys Thr Ala Glu Asp
3275 3280 3285 



Met Leu Asn Pro Asn Tyr Glu Asp Leu Leu Ile Arg Lys Ser Asn
3290 3295 3300



His Ser Phe Leu Val Gln Ala Gly Asn Val Gln Leu Arg Val Ile
3305 3310 3315 



Gly His Ser Met Gln Asn Cys Leu Leu Arg Leu Lys Val Asp Thr
3320 3325 3330 



Ser Asn Pro Lys Thr Pro Lys Tyr Lys Phe Val Arg Ile Gln Pro
3335 3340 3345 



Gly Gln Thr Phe Ser Val Leu Ala Cys Tyr Asn Gly Ser Pro Ser
3350 3355 3360



Gly Val Tyr Gln Cys Ala Met Arg Pro Asn His Thr Ile Lys Gly
3365 3370 3375 



Ser Phe Leu Asn Gly Ser Cys Gly Ser Val Gly Phe Asn Ile Asp
3380 3385 3390 



Tyr Asp Cys Val Ser Phe Cys Tyr Met His His Met Glu Leu Pro
3395 3400 3405 
Thr Gly Val His Ala Gly Thr Asp Leu Glu Gly Lys Phe Tyr Gly
3410 3415 3420



Pro Phe Val Asp Arg Gln Thr Ala Gln Ala Ala Gly Thr Asp Thr
3425 3430 3435


Thr Ile Thr Leu Asn Val Leu Ala Trp Leu Tyr Ala Ala Val Ile
3440 3445 3450



Asn Gly Asp Arg Trp Phe Leu Asn Arg Phe Thr Thr Thr Leu Asn 
3455 3460 3465



Asp Phe Asn Leu Val Ala Met Lys Tyr Asn Tyr Glu Pro Leu Thr
3470 3475 3480 



Gln Asp His Val Asp Ile Leu Gly Pro Leu Ser Ala Gln Thr Gly
3485 3490 3495



Ile Ala Val Leu Asp Met Cys Ala Ala Leu Lys Glu Leu Leu Gln
3500 3505 3510 



Asn Gly Met Asn Gly Arg Thr Ile Leu Gly Ser Thr Ile Leu Glu
3515 3520 3525



Asp Glu Phe Thr Pro Phe Asp Val Val Arg Gln Cys Ser Gly Val
3530 3535 3540



Thr Phe Gln Gly Lys Phe Lys Lys Ile Val Lys Gly Thr His His
3545 3550 3555



Trp Met Leu Leu Thr Phe Leu Thr Ser Leu Leu Ile Leu Val Gln
3560 3565 3570 



Ser Thr Gln Trp Ser Leu Phe Phe Phe Val Tyr Glu Asn Ala Phe
3575 3580 3585



Leu Pro Phe Thr Leu Gly Ile Met Ala Ile Ala Ala Cys Ala Met
3590 3595 3600 



Leu Leu Val Lys His Lys His Ala Phe Leu Cys Leu Phe Leu Leu 
3605 3610 3615



Pro Ser Leu Ala Thr Val Ala Tyr Phe Asn Met Val Tyr Met Pro 
3620 3625 3630 
Ala Ser Trp Val Met Arg Ile Met Thr Trp Leu Glu Leu Ala Asp
3635 3640 3645



Thr Ser Leu Ser Gly Tyr Arg Leu Lys Asp Cys Val Met Tyr Ala
3650 3655 3660 



Ser Ala Leu Val Leu Leu Ile Leu Met Thr Ala Arg Thr Val Tyr
3665 3670 3675



Asp Asp Ala Ala Arg Arg Val Trp Thr Leu Met Asn Val Ile Thr
3680 3685 3690 



Leu Val Tyr Lys Val Tyr Tyr Gly Asn Ala Leu Asp Gln Ala Ile
3695 3700 3705



Ser Met Trp Ala Leu Val Ile Ser Val Thr Ser Asn Tyr Ser Gly
3710 3715 3720 



Val Val Thr Thr Ile Met Phe Leu Ala Arg Ala Ile Val Phe Val
3725 3730 3735 



Cys Val Glu Tyr Tyr Pro Leu Leu Phe Ile Thr Gly Asn Thr Leu
3740 3745 3750



Gln Cys Ile Met Leu Val Tyr Cys Phe Leu Gly Tyr Cys Cys Cys
3755 3760 3765



Cys Tyr Phe Gly Leu Phe Cys Leu Leu Asn Arg Tyr Phe Arg Leu 
3770 3775 3780 



Thr Leu Gly Val Tyr Asp Tyr Leu Val Ser Thr Gln Glu Phe Arg
3785 3790 3795 



Tyr Met Asn Ser Gln Gly Leu Leu Pro Pro Lys Ser Ser Ile Asp
3800 3805 3810 



Ala Phe Lys Leu Asn Ile Lys Leu Leu Gly Ile Gly Gly Lys Pro
3815 3820 3825 



Cys Ile Lys Val Ala Thr Val Gln Ser Lys Met Ser Asp Val Lys
3830 3835 3840



Cys Thr Ser Val Val Leu Leu Ser Val Leu Gln Gln Leu Arg Val
3845 3850 3855 



Glu Ser Ser Ser Lys Leu Trp Ala Gln Cys Val Gln Leu His Asn
3860 3865 3870



Asp Ile Leu Leu Ala Lys Asp Thr Thr Glu Ala Phe Glu Lys Met
3875 3880 3885 



Val Ser Leu Leu Ser Val Leu Leu Ser Met Gln Gly Ala Val Asp
3890 3895 3900


Ile Asn Arg Leu Cys Glu Glu Met Leu Asp Asn Arg Ala Thr Leu
3905 3910 3915



Gln Ala Ile Ala Ser Glu Phe Ser Ser Leu Pro Ser Tyr Ala Ala
3920 3925 3930 



Tyr Ala Thr Ala Gln Glu Ala Tyr Glu Gln Ala Val Ala Asn Gly
3935 3940 3945



Asp Ser Glu Val Val Leu Lys Lys Leu Lys Lys Ser Leu Asn Val 
3950 3955 3960 



Ala Lys Ser Glu Phe Asp Arg Asp Ala Ala Met Gln Arg Lys Leu
3965 3970 3975 

Glu Lys Met Ala Asp Gln Ala Met Thr Gln Met Tyr Lys Gln Ala
3980 3985 3990



Arg Ser Glu Asp Lys Arg Ala Lys Val Thr Ser Ala Met Gln Thr
3995 4000 4005 



Met Leu Phe Thr Met Leu Arg Lys Leu Asp Asn Asp Ala Leu Asn 
4010 4015 4020 



Asn Ile Ile Asn Asn Ala Arg Asp Gly Cys Val Pro Leu Asn Ile
4025 4030 4035



Ile Pro Leu Thr Thr Ala Ala Lys Leu Met Val Val Val Pro Asp
4040 4045 4050 



Tyr Gly Thr Tyr Lys Asn Thr Cys Asp Gly Asn Thr Phe Thr Tyr
4055 4060 4065 



Ala Ser Ala Leu Trp Glu Ile Gln Gln Val Val Asp Ala Asp Ser
4070 4075 4080 



Lys Ile Val Gln Leu Ser Glu Ile Asn Met Asp Asn Ser Pro Asn
4085 4090 4095 
Leu Ala Trp Pro Leu Ile Val Thr Ala Leu Arg Ala Asn Ser Ala
4100 4105 4110



Val Lys Leu Gln Asn Asn Glu Leu Ser Pro Val Ala Leu Arg Gln
4115 4120 4125 



Met Ser Cys Ala Ala Gly Thr Thr Gln Thr Ala Cys Thr Asp Asp
4130 4135 4140



Asn Ala Leu Ala Tyr Tyr Asn Asn Ser Lys Gly Gly Arg Phe Val
4145 4150 4155 



Leu Ala Leu Leu Ser Asp His Gln Asp Leu Lys Trp Ala Arg Phe
4160 4165 4170 



Pro Lys Ser Asp Gly Thr Gly Thr Ile Tyr Thr Glu Leu Glu Pro
4175 4180 4185 



Pro Cys Arg Phe Val Thr Asp Thr Pro Lys Gly Pro Lys Val Lys
4190 4195 4200 



Tyr Leu Tyr Phe Ile Lys Gly Leu Asn Asn Leu Asn Arg Gly Met
4205 4210 4215



Val Leu Gly Ser Leu Ala Ala Thr Val Arg Leu Gln Ala Gly Asn
4220 4225 4230



Ala Thr Glu Val Pro Ala Asn Ser Thr Val Leu Ser Phe Cys Ala
4235 4240 4245 



Phe Ala Val Asp Pro Ala Lys Ala Tyr Lys Asp Tyr Leu Ala Ser
4250 4255 4260 



Gly Gly Gln Pro Ile Thr Asn Cys Val Lys Met Leu Cys Thr His
4265 4270 4275



Thr Gly Thr Gly Gln Ala Ile Thr Val Thr Pro Glu Ala Asn Met
4280 4285 4290



Asp Gln Glu Ser Phe Gly Gly Ala Ser Cys Cys Leu Tyr Cys Arg
4295 4300 4305 



Cys His Ile Asp His Pro Asn Pro Lys Gly Phe Cys Asp Leu Lys
4310 4315 4320 






WO 2004/096842 PCT/CA2004/000626



Gly Lys Tyr Val Gln Ile Pro Thr Thr Cys Ala Asn Asp Pro Val
4325 4330 4335 



Gly Phe Thr Leu Arg Asn Thr Val Cys Thr Val Cys Gly Met Trp
4340 4345 4350


Lys Gly Tyr Gly Cys Ser Cys Asp Gln Leu Arg Glu Pro Leu Met
4355 4360 4365



Gln Ser Ala Asp Ala Ser Thr Phe
4370 4375 



<210> 64 
<211> 2697 
<212> PRT 

<213> Severe acute respiratory syndrome virus
<400> 64 


Phe Lys Arg Val Cys Gly Val Ser Ala Ala Arg Leu Thr Pro Cys Gly
1 5 10 15



Thr Gly Thr Ser Thr Asp Val Val Tyr Arg Ala Phe Asp Ile Tyr Asn
20 25 30



Glu Lys Val Ala Gly Phe Ala Lys Phe Leu Lys Thr Asn Cys Cys Arg 
35 40 45



Phe Gln Glu Lys Asp Glu Glu Gly Asn Leu Leu Asp Ser Tyr Phe Val
50 55 60 



Val Lys Arg His Thr Met Ser Asn Tyr Gln His Glu Glu Thr Ile Tyr
65 70 75 80



Asn Leu Val Lys Asp Cys Pro Ala Val Ala Val His Asp Phe Phe Lys 

85 90 95



Phe Arg Val Asp Gly Asp Met Val Pro His Ile Ser Arg Gln Arg Leu

100 105 110 



Thr Lys Tyr Thr Met Ala Asp Leu Val Tyr Ala Leu Arg His Phe Asp 
115 120 125



Glu Gly Asn Cys Asp Thr Leu Lys Glu Ile Leu Val Thr Tyr Asn Cys
130 135 140 



Cys Asp Asp Asp Tyr Phe Asn Lys Lys Asp Trp Tyr Asp Phe Val Glu 
145 150 155 160


Asn Pro Asp Ile Leu Arg Val Tyr Ala Asn Leu Gly Glu Arg Val Arg
165 170 175


Gln Ser Leu Leu Lys Thr Val Gln Phe Cys Asp Ala Met Arg Asp Ala
180 185 190



Gly Ile Val Gly Val Leu Thr Leu Asp Asn Gln Asp Leu Asn Gly Asn
195 200 205 



Trp Tyr Asp Phe Gly Asp Phe Val Gln Val Ala Pro Gly Cys Gly Val
210 215 220




Pro Ile Val Asp Ser Tyr Tyr Ser Leu Leu Met Pro Ile Leu Thr Leu
225 230 235 240



Thr Arg Ala Leu Ala Ala Glu Ser His Met Asp Ala Asp Leu Ala Lys 

245 250 255 



Pro Leu Ile Lys Trp Asp Leu Leu Lys Tyr Asp Phe Thr Glu Glu Arg
260 265 270



Leu Cys Leu Phe Asp Arg Tyr Phe Lys Tyr Trp Asp Gln Thr Tyr His
275 280 285 



Pro Asn Cys Ile Asn Cys Leu Asp Asp Arg Cys Ile Leu His Cys Ala
290 295 300



Asn Phe Asn Val Leu Phe Ser Thr Val Phe Pro Pro Thr Ser Phe Gly 
305 310 315 320



Pro Leu Val Arg Lys Ile Phe Val Asp Gly Val Pro Phe Val Val Ser

325 330 335 



Thr Gly Tyr His Phe Arg Glu Leu Gly Val Val His Asn Gln Asp Val

340 345 350 



Asn Leu His Ser Ser Arg Leu Ser Phe Lys Glu Leu Leu Val Tyr Ala 
355 360 365



Ala Asp Pro Ala Met His Ala Ala Ser Gly Asn Leu Leu Leu Asp Lys 
370 375 380 



Arg Thr Thr Cys Phe Ser Val Ala Ala Leu Thr Asn Asn Val Ala Phe 
385 390 395 400 
Gln Thr Val Lys Pro Gly Asn Phe Asn Lys Asp Phe Tyr Asp Phe Ala
405 410 415



Val Ser Lys Gly Phe Phe Lys Glu Gly Ser Ser Val Glu Leu Lys His
420 425 430


Phe Phe Phe Ala Gln Asp Gly Asn Ala Ala Ile Ser Asp Tyr Asp Tyr
435 440 445



Tyr Arg Tyr Asn Leu Pro Thr Met Cys Asp Ile Arg Gln Leu Leu Phe
450 455 460 



Val Val Glu Val Val Asp Lys Tyr Phe Asp Cys Tyr Asp Gly Gly Cys
465 470 475 480 



Ile Asn Ala Asn Gln Val Ile Val Asn Asn Leu Asp Lys Ser Ala Gly
485 490 495



Phe Pro Phe Asn Lys Trp Gly Lys Ala Arg Leu Tyr Tyr Asp Ser Met 

500 505 510



Ser Tyr Glu Asp Gln Asp Ala Leu Phe Ala Tyr Thr Lys Arg Asn Val
515 520 525



Ile Pro Thr Ile Thr Gln Met Asn Leu Lys Tyr Ala Ile Ser Ala Lys
530 535 540



Asn Arg Ala Arg Thr Val Ala Gly Val Ser Ile Cys Ser Thr Met Thr
545 550 555 560 



Asn Arg Gln Phe His Gln Lys Leu Leu Lys Ser Ile Ala Ala Thr Arg
565 570 575



Gly Ala Thr Val Val Ile Gly Thr Ser Lys Phe Tyr Gly Gly Trp His

580 585 590 



Asn Met Leu Lys Thr Val Tyr Ser Asp Val Glu Thr Pro His Leu Met 
595 600 605



Gly Trp Asp Tyr Pro Lys Cys Asp Arg Ala Met Pro Asn Met Leu Arg 
610 615 620 



Ile Met Ala Ser Leu Val Leu Ala Arg Lys His Asn Thr Cys Cys Asn
625 630 635 640 
Leu Ser His Arg Phe Tyr Arg Leu Ala Asn Glu Cys Ala Gln Val Leu
645 650 655



Ser Glu Met Val Met Cys Gly Gly Ser Leu Tyr Val Lys Pro Gly Gly
660 665 670

Thr Ser Ser Gly Asp Ala Thr Thr Ala Tyr Ala Asn Ser Val Phe Asn
675 680 685



Ile Cys Gln Ala Val Thr Ala Asn Val Asn Ala Leu Leu Ser Thr Asp
690 695 700



Gly Asn Lys Ile Ala Asp Lys Tyr Val Arg Asn Leu Gln His Arg Leu
705 710 715 720



Tyr Glu Cys Leu Tyr Arg Asn Arg Asp Val Asp His Glu Phe Val Asp

725 730 735 



Glu Phe Tyr Ala Tyr Leu Arg Lys His Phe Ser Met Met Ile Leu Ser
740 745 750



Asp Asp Ala Val Val Cys Tyr Asn Ser Asn Tyr Ala Ala Gln Gly Leu
755 760 765



Val Ala Ser Ile Lys Asn Phe Lys Ala Val Leu Tyr Tyr Gln Asn Asn
770 775 780



Val Phe Met Ser Glu Ala Lys Cys Trp Thr Glu Thr Asp Leu Thr Lys
785 790 795 800



Gly Pro His Glu Phe Cys Ser Gln His Thr Met Leu Val Lys Gln Gly

805 810 815 



Asp Asp Tyr Val Tyr Leu Pro Tyr Pro Asp Pro Ser Arg Ile Leu Gly

820 825 830 



Ala Gly Cys Phe Val Asp Asp Ile Val Lys Thr Asp Gly Thr Leu Met
835 840 845



Ile Glu Arg Phe Val Ser Leu Ala Ile Asp Ala Tyr Pro Leu Thr Lys
850 855 860 



His Pro Asn Gln Glu Tyr Ala Asp Val Phe His Leu Tyr Leu Gln Tyr
865 870 875 880
Ile Arg Lys Leu His Asp Glu Leu Thr Gly His Met Leu Asp Met Tyr
885 890 895

Ser Val Met Leu Thr Asn Asp Asn Thr Ser Arg Tyr Trp Glu Pro Glu

900 905 910 


Phe Tyr Glu Ala Met Tyr Thr Pro His Thr Val Leu Gln Ala Val Gly
915 920 925


Ala Cys Val Leu Cys Asn Ser Gln Thr Ser Leu Arg Cys Gly Ala Cys
930 935 940



Ile Arg Arg Pro Phe Leu Cys Cys Lys Cys Cys Tyr Asp His Val Ile
945 950 955 960



Ser Thr Ser His Lys Leu Val Leu Ser Val Asn Pro Tyr Val Cys Asn
965 970 975



Ala Pro Gly Cys Asp Val Thr Asp Val Thr Gln Leu Tyr Leu Gly Gly

980 985 990 



Met Ser Tyr Tyr Cys Lys Ser His Lys Pro Pro Ile Ser Phe Pro Leu
995 1000 1005



Cys Ala Asn Gly Gln Val Phe Gly Leu Tyr Lys Asn Thr Cys Val
1010 1015 1020



Gly Ser Asp Asn Val Thr Asp Phe Asn Ala Ile Ala Thr Cys Asp
1025 1030 1035



Trp Thr Asn Ala Gly Asp Tyr Ile Leu Ala Asn Thr Cys Thr Glu
1040 1045 1050 



Arg Leu Lys Leu Phe Ala Ala Glu Thr Leu Lys Ala Thr Glu Glu
1055 1060 1065 



Thr Phe Lys Leu Ser Tyr Gly Ile Ala Thr Val Arg Glu Val Leu
1070 1075 1080 



Ser Asp Arg Glu Leu His Leu Ser Trp Glu Val Gly Lys Pro Arg 
1085 1090 1095 



Pro Pro Leu Asn Arg Asn Tyr Val Phe Thr Gly Tyr Arg Val Thr 
1100 1105 1110 



Lys Asn Ser Lys Val Gln Ile Gly Glu Tyr Thr Phe Glu Lys Gly
1115 1120 1125



Asp Tyr Gly Asp Ala Val Val Tyr Arg Gly Thr Thr Thr Tyr Lys
1130 1135 1140



Leu Asn Val Gly Asp Tyr Phe Val Leu Thr Ser His Thr Val Met
1145 1150 1155



Pro Leu Ser Ala Pro Thr Leu Val Pro Gln Glu His Tyr Val Arg
1160 1165 1170



Ile Thr Gly Leu Tyr Pro Thr Leu Asn Ile Ser Asp Glu Phe Ser
1175 1180 1185



Ser Asn Val Ala Asn Tyr Gln Lys Val Gly Met Gln Lys Tyr Ser
1190 1195 1200 



Thr Leu Gln Gly Pro Pro Gly Thr Gly Lys Ser His Phe Ala Ile
1205 1210 1215 



Gly Leu Ala Leu Tyr Tyr Pro Ser Ala Arg Ile Val Tyr Thr Ala
1220 1225 1230 



Cys Ser His Ala Ala Val Asp Ala Leu Cys Glu Lys Ala Leu Lys
1235 1240 1245



Tyr Leu Pro Ile Asp Lys Cys Ser Arg Ile Ile Pro Ala Arg Ala
1250 1255 1260



Arg Val Glu Cys Phe Asp Lys Phe Lys Val Asn Ser Thr Leu Glu 
1265 1270 1275 



Gln Tyr Val Phe Cys Thr Val Asn Ala Leu Pro Glu Thr Thr Ala
1280 1285 1290 



Asp Ile Val Val Phe Asp Glu Ile Ser Met Ala Thr Asn Tyr Asp
1295 1300 1305 



Leu Ser Val Val Asn Ala Arg Leu Arg Ala Lys His Tyr Val Tyr
1310 1315 1320 



Ile Gly Asp Pro Ala Gln Leu Pro Ala Pro Arg Thr Leu Leu Thr
1325 1330 1335 



Lys Gly Thr Leu Glu Pro Glu Tyr Phe Asn Ser Val Cys Arg Leu 
1340 1345 1350 
Met Lys Thr Ile Gly Pro Asp Met Phe Leu Gly Thr Cys Arg Arg
1355 1360 1365



Cys Pro Ala Glu Ile Val Asp Thr Val Ser Ala Leu Val Tyr Asp
1370 1375 1380


Asn Lys Leu Lys Ala His Lys Asp Lys Ser Ala Gln Cys Phe Lys
1385 1390 1395



Met Phe Tyr Lys Gly Val Ile Thr His Asp Val Ser Ser Ala Ile
1400 1405 1410



Asn Arg Pro Gln Ile Gly Val Val Arg Glu Phe Leu Thr Arg Asn
1415 1420 1425



Pro Ala Trp Arg Lys Ala Val Phe Ile Ser Pro Tyr Asn Ser Gln
1430 1435 1440 



Asn Ala Val Ala Ser Lys Ile Leu Gly Leu Pro Thr Gln Thr Val
1445 1450 1455



Asp Ser Ser Gln Gly Ser Glu Tyr Asp Tyr Val Ile Phe Thr Gln
1460 1465 1470

Thr Thr Glu Thr Ala His Ser Cys Asn Val Asn Arg Phe Asn Val
1475 1480 1485 


Ala Ile Thr Arg Ala Lys Ile Gly Ile Leu Cys Ile Met Ser Asp
1490 1495 1500



Arg Asp Leu Tyr Asp Lys Leu Gln Phe Thr Ser Leu Glu Ile Pro
1505 1510 1515



Arg Arg Asn Val Ala Thr Leu Gln Ala Glu Asn Val Thr Gly Leu
1520 1525 1530



Phe Lys Asp Cys Ser Lys Ile Ile Thr Gly Leu His Pro Thr Gln
1535 1540 1545



Ala Pro Thr His Leu Ser Val Asp Ile Lys Phe Lys Thr Glu Gly
1550 1555 1560 

Leu Cys Val Asp Ile Pro Gly Ile Pro Lys Asp Met Thr Tyr Arg
1565 1570 1575 
Arg Leu Ile Ser Met Met Gly Phe Lys Met Asn Tyr Gln Val Asn
1580 1585 1590



Gly Tyr Pro Asn Met Phe Ile Thr Arg Glu Glu Ala Ile Arg His
1595 1600 1605

Val Arg Ala Trp Ile Gly Phe Asp Val Glu Gly Cys His Ala Thr
1610 1615 1620



Arg Asp Ala Val Gly Thr Asn Leu Pro Leu Gln Leu Gly Phe Ser
1625 1630 1635 



Thr Gly Val Asn Leu Val Ala Val Pro Thr Gly Tyr Val Asp Thr 
1640 1645 1650 



Glu Asn Asn Thr Glu Phe Thr Arg Val Asn Ala Lys Pro Pro Pro
1655 1660 1665



Gly Asp Gln Phe Lys His Leu Ile Pro Leu Met Tyr Lys Gly Leu
1670 1675 1680



Pro Trp Asn Val Val Arg Ile Lys Ile Val Gln Met Leu Ser Asp
1685 1690 1695



Thr Leu Lys Gly Leu Ser Asp Arg Val Val Phe Val Leu Trp Ala
1700 1705 1710



His Gly Phe Glu Leu Thr Ser Met Lys Tyr Phe Val Lys Ile Gly
1715 1720 1725 



Pro Glu Arg Thr Cys Cys Leu Cys Asp Lys Arg Ala Thr Cys Phe
1730 1735 1740 



Ser Thr Ser Ser Asp Thr Tyr Ala Cys Trp Asn His Ser Val Gly 
1745 1750 1755 



Phe Asp Tyr Val Tyr Asn Pro Phe Met Ile Asp Val Gln Gln Trp
1760 1765 1770 



Gly Phe Thr Gly Asn Leu Gln Ser Asn His Asp Gln His Cys Gln
1775 1780 1785 



Val His Gly Asn Ala His Val Ala Ser Cys Asp Ala Ile Met Thr
1790 1795 1800
Arg Cys Leu Ala Val His Glu Cys Phe Val Lys Arg Val Asp Trp
1805 1810 1815


Ser Val Glu Tyr Pro Ile Ile Gly Asp Glu Leu Arg Val Asn Ser
1820 1825 1830

Ala Cys Arg Lys Val Gln His Met Val Val Lys Ser Ala Leu Leu
1835 1840 1845


Ala Asp Lys Phe Pro Val Leu His Asp Ile Gly Asn Pro Lys Ala
1850 1855 1860 


Ile Lys Cys Val Pro Gln Ala Glu Val Glu Trp Lys Phe Tyr Asp
1865 1870 1875 

Ala Gln Pro Cys Ser Asp Lys Ala Tyr Lys Ile Glu Glu Leu Phe
1880 1885 1890

Tyr Ser Tyr Ala Thr His His Asp Lys Phe Thr Asp Gly Val Cys 
1895 1900 1905 



Leu Phe Trp Asn Cys Asn Val Asp Arg Tyr Pro Ala Asn Ala Ile
1910 1915 1920 

Val Cys Arg Phe Asp Thr Arg Val Leu Ser Asn Leu Asn Leu Pro 
1925 1930 1935



Gly Cys Asp Gly Gly Ser Leu Tyr Val Asn Lys His Ala Phe His 
1940 1945 1950



Thr Pro Ala Phe Asp Lys Ser Ala Phe Thr Asn Leu Lys Gln Leu
1955 1960 1965



Pro Phe Phe Tyr Tyr Ser Asp Ser Pro Cys Glu Ser His Gly Lys 
1970 1975 1980 



Gln Val Val Ser Asp Ile Asp Tyr Val Pro Leu Lys Ser Ala Thr
1985 1990 1995 



Cys Ile Thr Arg Cys Asn Leu Gly Gly Ala Val Cys Arg His His
2000 2005 2010 



Ala Asn Glu Tyr Arg Gln Tyr Leu Asp Ala Tyr Asn Met Met Ile
2015 2020 2025 



Ser Ala Gly Phe Ser Leu Trp Ile Tyr Lys Gln Phe Asp Thr Tyr
2030 2035 2040



Asn Leu Trp Asn Thr Phe Thr Arg Leu Gln Ser Leu Glu Asn Val
2045 2050 2055



Ala Tyr Asn Val Val Asn Lys Gly His Phe Asp Gly His Ala Gly
2060 2065 2070 



Glu Ala Pro Val Ser Ile Ile Asn Asn Ala Val Tyr Thr Lys Val
2075 2080 2085 



Asp Gly Ile Asp Val Glu Ile Phe Glu Asn Lys Thr Thr Leu Pro
2090 2095 2100 



Val Asn Val Ala Phe Glu Leu Trp Ala Lys Arg Asn Ile Lys Pro
2105 2110 2115



Val Pro Glu Ile Lys Ile Leu Asn Asn Leu Gly Val Asp Ile Ala
2120 2125 2130



Ala Asn Thr Val Ile Trp Asp Tyr Lys Arg Glu Ala Pro Ala His
2135 2140 2145



Val Ser Thr Ile Gly Val Cys Thr Met Thr Asp Ile Ala Lys Lys
2150 2155 2160



Pro Thr Glu Ser Ala Cys Ser Ser Leu Thr Val Leu Phe Asp Gly 
2165 2170 2175 



Arg Val Glu Gly Gln Val Asp Leu Phe Arg Asn Ala Arg Asn Gly
2180 2185 2190 



Val Leu Ile Thr Glu Gly Ser Val Lys Gly Leu Thr Pro Ser Lys
2195 2200 2205



Gly Pro Ala Gln Ala Ser Val Asn Gly Val Thr Leu Ile Gly Glu
2210 2215 2220 



Ser Val Lys Thr Gln Phe Asn Tyr Phe Lys Lys Val Asp Gly Ile
2225 2230 2235 



Ile Gln Gln Leu Pro Glu Thr Tyr Phe Thr Gln Ser Arg Asp Leu
2240 2245 2250 



Glu Asp Phe Lys Pro Arg Ser Gln Met Glu Thr Asp Phe Leu Glu
2255 2260 2265 
Leu Ala Met Asp Glu Phe Ile Gln Arg Tyr Lys Leu Glu Gly Tyr
2270 2275 2280



Ala Phe Glu His Ile Val Tyr Gly Asp Phe Ser His Gly Gln Leu
2285 2290 2295




Gly Gly Leu His Leu Met Ile Gly Leu Ala Lys Arg Ser Gln Asp
2300 2305 2310



Ser Pro Leu Lys Leu Glu Asp Phe Ile Pro Met Asp Ser Thr Val
2315 2320 2325 



Lys Asn Tyr Phe Ile Thr Asp Ala Gln Thr Gly Ser Ser Lys Cys
2330 2335 2340



Val Cys Ser Val Ile Asp Leu Leu Leu Asp Asp Phe Val Glu Ile
2345 2350 2355 



Ile Lys Ser Gln Asp Leu Ser Val Ile Ser Lys Val Val Lys Val
2360 2365 2370 



Thr Ile Asp Tyr Ala Glu Ile Ser Phe Met Leu Trp Cys Lys Asp
2375 2380 2385



Gly His Val Glu Thr Phe Tyr Pro Lys Leu Gln Ala Ser Gln Ala
2390 2395 2400 



Trp Gln Pro Gly Val Ala Met Pro Asn Leu Tyr Lys Met Gln Arg
2405 2410 2415 



Met Leu Leu Glu Lys Cys Asp Leu Gln Asn Tyr Gly Glu Asn Ala
2420 2425 2430



Val Ile Pro Lys Gly Ile Met Met Asn Val Ala Lys Tyr Thr Gln
2435 2440 2445 



Leu Cys Gln Tyr Leu Asn Thr Leu Thr Leu Ala Val Pro Tyr Asn
2450 2455 2460 



Met Arg Val Ile His Phe Gly Ala Gly Ser Asp Lys Gly Val Ala
2465 2470 2475 



Pro Gly Thr Ala Val Leu Arg Gln Trp Leu Pro Thr Gly Thr Leu
2480 2485 2490 



Leu Val Asp Ser Asp Leu Asn Asp Phe Val Ser Asp Ala Asp Ser
2495 2500 2505



Thr Leu Ile Gly Asp Cys Ala Thr Val His Thr Ala Asn Lys Trp
2510 2515 2520

Asp Leu Ile Ile Ser Asp Met Tyr Asp Pro Arg Thr Lys His Val
2525 2530 2535 



Thr Lys Glu Asn Asp Ser Lys Glu Gly Phe Phe Thr Tyr Leu Cys
2540 2545 2550 



Gly Phe Ile Lys Gln Lys Leu Ala Leu Gly Gly Ser Ile Ala Val
2555 2560 2565 



Lys Ile Thr Glu His Ser Trp Asn Ala Asp Leu Tyr Lys Leu Met
2570 2575 2580



Gly His Phe Ser Trp Trp Thr Ala Phe Val Thr Asn Val Asn Ala
2585 2590 2595



Ser Ser Ser Glu Ala Phe Leu Ile Gly Ala Asn Tyr Leu Gly Lys
2600 2605 2610



Pro Lys Glu Gln Ile Asp Gly Tyr Thr Met His Ala Asn Tyr Ile
2615 2620 2625



Phe Trp Arg Asn Thr Asn Pro Ile Gln Leu Ser Ser Tyr Ser Leu
2630 2635 2640 



Phe Asp Met Ser Lys Phe Pro Leu Lys Leu Arg Gly Thr Ala Val 
2645 2650 2655



Met Ser Leu Lys Glu Asn Gln Ile Asn Asp Met Ile Tyr Ser Leu
2660 2665 2670



Leu Glu Lys Gly Arg Leu Ile Ile Arg Glu Asn Asn Arg Val Val
2675 2680 2685 



Val Ser Ser Asp Ile Leu Val Asn Asn
2690 2695



<210> 65 

<211> 274 

<212> PRT 

<213> Severe acute respiratory syndrome virus 
<400> 65



Met Asp Leu Phe Met Arg Phe Phe Thr Leu Arg Ser Ile Thr Ala Gln
1 5 10 15



Pro Val Lys Ile Asp Asn Ala Ser Pro Ala Ser Thr Val His Ala Thr
20 25 30

Ala Thr Ile Pro Leu Gln Ala Ser Leu Pro Phe Gly Trp Leu Val Ile
35 40 45



Gly Val Ala Phe Leu Ala Val Phe Gln Ser Ala Thr Lys Ile Ile Ala
50 55 60



Leu Asn Lys Arg Trp Gln Leu Ala Leu Tyr Lys Gly Phe Gln Phe Ile
65 70 75 80



Cys Asn Leu Leu Leu Leu Phe Val Thr Ile Tyr Ser His Leu Leu Leu
85 90 95



Val Ala Ala Gly Met Glu Ala Gln Phe Leu Tyr Leu Tyr Ala Leu Ile
100 105 110



Tyr Phe Leu Gln Cys Ile Asn Ala Cys Arg Ile Ile Met Arg Cys Trp
115 120 125



Leu Cys Trp Lys Cys Lys Ser Lys Asn Pro Leu Leu Tyr Asp Ala Asn 
130 135 140 



Tyr Phe Val Cys Trp His Thr His Asn Tyr Asp Tyr Cys Ile Pro Tyr
145 150 155 160



Asn Ser Val Thr Asp Thr Ile Val Val Thr Glu Gly Asp Gly Ile Ser

165 170 175 



Thr Pro Lys Leu Lys Glu Asp Tyr Gln Ile Gly Gly Tyr Ser Glu Asp

180 185 190 



Arg His Ser Gly Val Lys Asp Tyr Val Val Val His' Gly Tyr Phe Thr 
195 200 205 



Glu Val Tyr Tyr Gin Leu Glu Ser Thr Gin He Thr Thr Asp Thr Gly 
210 215 220 



He Glu Asn Ala Thr Phe Phe He Phe Asn Lys Leu Val Lys Asp Pro 
225 230 235 240 
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Pro Asn Val Gin He Hi*s Thr He Asp Gly Ser Ser.Gly Val .Ala Asn 

245 250 255 



Pro Ala Met Asp Pro He Tyr Asp Glu Pro Thr- Thr Thr Thr Ser Val 

260* . 265 270 • 



Pro Leu 



<210> 66

<211> 154

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 66

Met Met Pro Thr Thr Leu Phe Ala Gly Thr His Ile Thr Met Thr Thr
1 5 10 15

Val Tyr His Ile Thr Val Ser Gln Ile Gln Leu Ser Leu Leu Lys Val
20 25 30

Thr Ala Phe Gln His Gln Asn Ser Lys Lys Thr Thr Lys Leu Val Val
35 40 45

Ile Leu Arg Ile Gly Thr Gln Val Leu Lys Thr Met Ser Leu Tyr Met
50 55 60

Ala Ile Ser Pro Lys Phe Thr Thr Ser Leu Ser Leu His Lys Leu Leu
65 70 75 80

Gln Thr Leu Val Leu Lys Met Leu His Ser Ser Ser Leu Thr Ser Leu
85 90 95

Leu Lys Thr His Arg Met Cys Lys Tyr Thr Gln Ser Thr Ala Leu Gln
100 105 110

Glu Leu Leu Ile Gln Gln Trp Ile Gln Phe Met Met Ser Arg Arg Arg
115 120 125

Leu Leu Ala Cys Leu Cys Lys His Lys Lys Val Ser Thr Asn Leu Cys
130 135 140

Thr His Ser Phe Arg Lys Lys Gln Val Arg
145 150

<210> 67

<211> 63

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 67

Met Phe His Leu Val Asp Phe Gln Val Thr Ile Ala Glu Ile Leu Ile
1 5 10 15

Ile Ile Met Arg Thr Phe Arg Ile Ala Ile Trp Asn Leu Asp Val Ile
20 25 30

Ile Ser Ser Ile Val Arg Gln Leu Phe Lys Pro Leu Thr Lys Lys Asn
35 40 45

Tyr Ser Glu Leu Asp Asp Glu Glu Pro Met Glu Leu Asp Tyr Pro
50 55 60



<210> 68

<211> 122

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 68

Met Lys Ile Ile Leu Phe Leu Thr Leu Ile Val Phe Thr Ser Cys Glu
1 5 10 15

Leu Tyr His Tyr Gln Glu Cys Val Arg Gly Thr Thr Val Leu Leu Lys
20 25 30

Glu Pro Cys Pro Ser Gly Thr Tyr Glu Gly Asn Ser Pro Phe His Pro
35 40 45

Leu Ala Asp Asn Lys Phe Ala Leu Thr Cys Thr Ser Thr His Phe Ala
50 55 60

Phe Ala Cys Ala Asp Gly Thr Arg His Thr Tyr Gln Leu Arg Ala Arg
65 70 75 80

Ser Val Ser Pro Lys Leu Phe Ile Arg Gln Glu Glu Val Gln Gln Glu
85 90 95

Leu Tyr Ser Pro Leu Phe Leu Ile Val Ala Ala Leu Val Phe Leu Ile
100 105 110

Leu Cys Phe Thr Ile Lys Arg Lys Thr Glu
115 120

<210> 69

<211> 44

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 69

Met Asn Glu Leu Thr Leu Ile Asp Phe Tyr Leu Cys Phe Leu Ala Phe
1 5 10 15

Leu Leu Phe Leu Val Leu Ile Met Leu Ile Ile Phe Trp Phe Ser Leu
20 25 30

Glu Ile Gln Asp Leu Glu Glu Pro Cys Thr Lys Val
35 40



<210> 70 

<211> 39 

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 70

Met Lys Leu Leu Ile Val Leu Thr Cys Ile Ser Leu Cys Ser Cys Ile
1 5 10 15

Cys Thr Val Val Gln Arg Cys Ala Ser Asn Lys Pro His Val Leu Glu
20 25 30

Asp Pro Cys Lys Val Gln His
35


<210> 71

<211> 84

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 71

Met Cys Leu Lys Ile
1 5 10 15

Ser Thr Ala Trp Leu Cys Ala Leu Gly Lys Val Leu Pro Phe His Arg
20 25 30

Trp His Thr Met Val Gln Thr Cys Thr Pro Asn Val Thr Ile Asn Cys
35 40 45

Gln Asp Pro Ala Gly Gly Ala Leu Ile Ala Arg Cys Trp Tyr Leu His
50 55 60

Glu Gly His Gln Thr Ala Ala Phe Arg Asp Val Leu Val Val Leu Asn
65 70 75 80



Lys Arg Thr Asn 



<210> 72

<211> 98

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 72

Met Asp Pro Asn Gln Thr Asn Val Val Pro Pro Ala Leu His Leu Val
1 5 10 15

Asp Pro Gln Ile Gln Leu Thr Ile Thr Arg Met Glu Asp Ala Met Gly
20 25 30

Gln Gly Gln Asn Ser Ala Asp Pro Lys Val Tyr Pro Ile Ile Leu Arg
35 40 45

Leu Gly Ser Gln Leu Ser Leu Ser Met Ala Arg Arg Asn Leu Asp Ser
50 55 60

Leu Glu Ala Arg Ala Phe Gln Ser Thr Pro Ile Val Val Gln Met Thr
65 70 75 80

Lys Leu Ala Thr Thr Glu Glu Leu Pro Asp Glu Phe Val Val Val Thr
85 90 95



Ala Lys 



<210> 73 

<211> 70 

<212> PRT 

<213> Severe acute respiratory syndrome virus 

<400> 73

Met Leu Pro Pro Cys Tyr Asn Phe Leu Lys Glu Gln His Cys Gln Lys
1 5 10 15

Ala Ser Thr Gln Arg Glu Ala Glu Ala Ala Val Lys Pro Leu Leu Ala
20 25 30

Pro His His Val Val Ala Val Ile Gln Glu Ile Gln Leu Leu Ala Ala
35 40 45

Val Gly Glu Ile Leu Leu Leu Glu Trp Leu Ala Glu Val Val Lys Leu
50 55 60

Pro Ser Arg Tyr Cys Cys
65 70



<210> 74 

<211> 6 

<212> RNA 

<213> Coronavirus

<400> 74 
cuaaac 



<210> 75 

<211> 13 

<212> PRT 

<213> Severe acute respiratory syndrome virus

<400> 75

Met Phe Ile Phe Leu Leu Phe Leu Thr Leu Thr Ser Gly
1 5 10



<210> 76

<211> 23

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 76

Thr Ile Pro Leu Gln Ala Ser Leu Pro Phe Gly Trp Leu Val Ile Gly
1 5 10 15



Val Ala Phe Leu Ala Val Phe 

20 



<210> 77
<211> 23
<212> PRT

<213> Severe acute respiratory syndrome virus
<400> 77

Phe Gln Phe Ile Cys Asn Leu Leu Leu Leu Phe Val Thr Ile Tyr Ser
1 5 10 15



His Leu Leu Leu Val Ala Ala 

20 



<210> 78 
<211> 23
<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 78

Ala Gln Phe Leu Tyr Leu Tyr Ala Leu Ile Tyr Phe Leu Gln Cys Ile
1 5 10 15

Asn Ala Cys Arg Ile Ile Met

20 



<210> 79
<211> 18
<212> PRT

<213> Severe acute respiratory syndrome virus
<400> 79

Val Leu Leu Phe Leu Ala Phe Val Val Phe Leu Leu Val Thr Leu Ala
1 5 10 15

Ile Leu



<210> 80 

<211> 23

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 80

Leu Leu Glu Gln Trp Asn Leu Val Ile Gly Phe Leu Phe Leu Ala Trp
1 5 10 15

Ile Met Leu Leu Gln Phe Ala

20 



<210> 81 

<211> 23 

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 81

Leu Val Phe Leu Trp Leu Leu Trp Pro Val Thr Leu Ala Cys Phe Val
1 5 10 15

Leu Ala Ala Val Tyr Arg Ile

20 



<210> 82
<211> 23
<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 82

Gly Gly Ile Ala Ile Ala Met Ala Cys Ile Val Gly Leu Met Trp Leu
1 5 10 15



Ser Tyr Phe Val Ala Ser Phe 

20 



<210> 83 
<211> 20
<212> PRT

<213> Severe acute respiratory syndrome virus
<400> 83

His Leu Val Asp Phe Gln Val Thr Ile Ala Glu Ile Leu Ile Ile Ile
1 5 10 15



Met Arg Thr Phe 

20 



<210> 84 
<211> 15 
<212> PRT 

<213> Severe acute respiratory syndrome virus
<400> 84

Met Lys Ile Ile Leu Phe Leu Thr Leu Ile Val Phe Thr Ser Cys
1 5 10 15



<210> 85 

<211> 19 

<212> PRT 

<213> Severe acute respiratory syndrome virus 

<400> 85

Ser Pro Leu Phe Leu Ile Val Ala Ala Leu Val Phe Leu Ile Leu Cys
1 5 10 15

Phe Thr Ile



<210> 86 

<211> 83 

<212> PRT 

<213> Severe acute respiratory syndrome virus 

<400> 86 

Glu Leu Tyr His Tyr Gln Glu Cys Val Arg Gly Thr Thr Val Leu Leu
1 5 10 15

Lys Glu Pro Cys Pro Ser Gly Thr Tyr Glu Gly Asn Ser Pro Phe His
20 25 30

Pro Leu Ala Asp Asn Lys Phe Ala Leu Thr Cys Thr Ser Thr His Phe
35 40 45

Ala Phe Ala Cys Ala Asp Gly Thr Arg His Thr Tyr Gln Leu Arg Ala
50 55 60

Arg Ser Val Ser Pro Lys Leu Phe Ile Arg Gln Glu Glu Val Gln Gln
65 70 75 80



Glu Leu Tyr 



<210> 87 

<211> 37 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Primer 



<400> 87

caggaaacag ctatgacacc aagaacaagg ctctcca 37 



<210> 88

<211> 37

<212> DNA

<213> Artificial Sequence

<220>

<223> Primer

<400> 88

caggaaacag ctatgacgat agggcctctt 37

<210> 89

<211> 496

<212> DNA

<213> Severe acute respiratory syndrome virus

<220>

<221> misc_feature

<222> (11)..(11)

<223> n is a, c, g, or t

<400> 89 

acctacccag ngaaaagcca accaacctcg atctcttgta gatctgttct ctaaacgaac 60

tttaaaatct gtgtagctgt cgctcggctg catgcctagt gcacctacgc agtataaaca 120

ataataaatt ttactgtcgt tgacaagaaa cgagtaactc gtccctcttc tgcagactgc 180

ttacggtttc gtccgtgttg cagtcgatca tcagcatacc taggtttcgt ccgggtgtga 240

ccgaaaggta agatggagag ccttgttctt ggtgtcaacg agaaaacaca cgtccaactc 300

agtttgcctg tccttcaggt tagagacgtg ctagtgcgtg gcttcgggga ctctgtggaa 360 

gaggccctat cggaggcacg tgaacacctc aaaaatggca cttgtggtqt agtagagctg 420 

gaaaaaggcg tactgcccca gcttgaacag ccctatgtgt tcattaaacg ttctgatgcc 480

ttaagcacca atcacg 496 



<210> 90
<211> 523
<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 90 

gtcgacaaca atttctgtgg cccagatggg taccctcttg attgcatcaa agattttctc 60 

gcacgcgcgg gcaagtcaat gtgcactctt tccgaacaac ttgattacat cgagtcgaag 120

agaggtgtct actgctgccg tgaccatgag catgaaattg cctggttcac tgagcgctct 180

gataagagct acgagcacca gacacccttc gaaattaaga gtgccaagaa atttgacact 240
ttcaaagggg aatgcccaaa gtttgtgttt cctcttaact caaaagtcaa agtcattcaa 300
ccacgtgttg aaaagaaaaa gactgagggt ttcatggggc gtatacgctc tgtgtaccct 360

gttgcatctc cacaggagtg taacaatatg cacttgtcta ccttgatgaa atgtaatcat 420

tgcgatgaag tttcatggca gacgtgcgac tttctgaaag ccacttgtga acattgtggc 480
actgaaaatt tagttattga aggacctact acatgtgggt acc 523



<210> 91 
<211> 324

<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 91

cttaggtgac gagcttggca ctgatcccat tgaagattat gaacaaaact ggaacactaa 60

gcatggcagt ggtgcactcc gtgaactcac tcgtgagctc aatggaggtg cagtcactcg 120
ctatgtcgac aacaatttct gtggcccaga tgggtaccct cttgattgca tcaaagattt 180 
tctcgcacgc gcgggcaagt caatgtgcac tctttccgaa caacttgatt acatcgagtc 240 
gaagagaggt gtctactgct gccgtgacca tgagcatgaa attgcctggt tcactgagcg 300 
ctcctgataa gagctacgag cacc 324 

<210> 92
<211> 495
<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 92

tgctataata agcgtgccta atgggttcct cgtgctagtg ctgatattgg gctcaggcca 60

tactggcatt actggtgaca atgtggagac cttgaatgag gatctccttg agatactgag 120

tcgtgaacgt gttaacatta acattgttgg cgattttcat ttgaatgaag aggttgccat 180

cattttggca tctttctctg cttctacaag tgcctttatt gacactataa agagtcttga 240

ttacaagtct ttcaaaacca ttgttgagtc ctgcggtaac tataaagtta ccaagggaaa 300 

gcccgtaaaa ggtgcttgga acattggaca acagagatca gttttaacac cactgtgtgg 360 

ttttccctca caggctgctg gtgttatcag atcaattttt gcgcgcacac ttgatgcagc 420

aaaccactca attcctgatt tgcaaagagc agctgtcacc atacttgatg gtatttctga 480 
acagtcatta cgtct 495 

<210> 93

<211> 486

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 93

gccactcaaa cattgaaact cgactccgca agggaggtag gactagatgt tttggaggct 60
gtgtgtttgc ctatgttggc tgctataata agcgtgccta ctgggttcct cgtgctagtg 120
ctgatattgg ctcaggccat actggcatta ctggtgacaa tgtggagacc ttgaatgagg 180
atctccttga gatactgagt cgtgaacgtg ttaacattaa cattgttggc gattttcatt 240
tgaatgaaga ggttgccatc attttggcat ctttctctgc ttctacaagt gcctttattg 300

acactataaa gagtcttgat tacaagtctt tcaaaaccat tgttgagtcc tgcggtaact 360

ataaagttac caagggaaag cccgtaaaag gtgcttggaa cattggacaa cagagatcag 420
ttttaacacc actgtgtggt tttccctcac aggctgctgg tgttatcaga tcaatttttg 480

cgcgca 486



<210> 94 
<211> 567 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 94 

cactactgtg gaaaaactca ggcctatctt tgaatggatt gaggcgaaac ttagtgcagg 60 
agttgaattt ctcaaggatg cttgggagat tctcaaattt ctcattacag gtgtttttga 120 
catcgtcaag ggtcaaatac aggttgcttc agataacatc aaggattgtg taaaatgctt 180

cattgatgtt gttaacaagg cactcgaaat gtgcattgat caagtcacta tcgctggcgc 240

aaagttgcga tcactcaact taggtgaagt cttcatcgct caaagcaagg gactttaccg 300
tcagtgtata cgtggcaagg agcagctgca actactcatg cctcttaagg caccaaaaga 360

agtaaccttt cttgaaggtg attcacatga cacagtactt acctctgagg aggttgttct 420

caagaacggt gaactcgaag cactcgagac gcccgttgat agcttcacaa atggagctat 480 

ggttggcaca ccagtctgtg taaatggcct catgctctta gagattaagg acaaagaaca 540

atactgcgca ttgtctcctg gtttact 567 

<210> 95 
<211> 516 
<212> DNA 

<213> Severe acute respiratory syndrome virus

<400> 95

gggagattct caaatttctc attacaggtg tttttgacat cgtcaagggt caaatacagg 60

ttgcttcaga taacatcaag gattgtgtaa aatgcttcat tgatgttgtt aacaaggcac 120

tcgaaatgtg cattgatcaa gtcactatcg ctggcgcaaa gttgcgatca ctcaacttag 180

gtgaagtctt catcgctcaa agcaagggac tttaccgtca gtgtatacgt ggcaaggagc 240

agctgcaact actcatgcct cttaaggcac caaaagaagt aacctttctt gaaggtgatt 300

cacatgacac agtacttacc tctgaggagg ttgttctcaa gaacggtgaa ctcgaagcac 360

tcgagacgcc cgttgatagc ttcacaaatg gagctatcgt tggcacacca gtctgtgtaa 420

atggcctcat gctcttagag attaaggaca aagaacaata ctgcgcattg tctcctggtt 480 

tactggctac aaacaatgtc tttcgcttaa aagggg 516 

<210> 96 
<211> 448 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 96 

agttcgagtt gaggaagaag aagaggaaga ctggctggat gatactactg agcaatcaga 60
gattgagcca gaaccagaac ctacacctga agaaccagtt aatcagttta ctggttattt 120
aaaacttact gacaatgttg ccattaaatg tgttgacatc gttaaggagg cacaaagtgc 180
taatcctatg gtgattgtaa atgctgctaa catacacctg aaacatggtg gtggtgtagc 240 
aggtgcactc aacaaggcaa ccaatggtgc catgcaaaag gagagtgatg attacattaa 300 
gctaaatggc cctcttacag taggagggtc ttgtttgctt tctggacata atcttgctaa 360 
gaagtgtctg catgttgttg gacctaacct aaatgcaggt gaggacatcc agcttcttaa 420 
ggcagcatat gaaaatttca attcacag 448



<210> 97

<211> 333

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 97

agaggatgat tatcaaggtc tccctctgga atttggtgcc tcagctgaaa cagttcgagt 60

tgaggaagaa gaagaggaag actggctgga tgatactact gagcaatcag agattgagcc 120

agaaccagaa cctacacctg aagaaccagt taatcagttt actggttatt taaaacttac 180

tgacaatgtt gccattaaat gtgttgacat cgttaaggag gcacaaagtg ctaatcctat 240

ggtgattgta aatgctgcta acatacacct gaaacatggt ggtggtgtag caggtgcact 300

caacaaggca accaatggtg ccatgcaaaa gga 333

<210> 98
<211> 399

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 98

gagatgctct caagagcttt gaagaaagtg ccagttgatg agtatataac cacgtaccct 60

ggacaaggat gtgctggtta tacacttgag gaagctaaga ctgctcttaa gaaatgcaaa 120

tctgcatttt atgtactacc ttcagaagca cctaatgcta aggaagagat tctaggaact 180

gtatcctgga atttgagaga aatgcttgct catgctgaag agacaagaaa attaatgcct 240

atatgcatgg atgttagagc cataatggca accatccaac gtaagtataa aggaattaaa 300

attcaagagg gcatcgttga ctatggtgtc cgattcttct tttatactag taaagagcct 360 
gtagcttcta ttattacgaa gctgaactct ctaaatgag 399 



<210> 99 

<211> 437 

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 99

agaaatctgt cgtacagaag cctgtcgatg tgaagccaaa aattaaggcc tgcattgatg 60
aggttaccac aacactggaa gaaactaagt ttcttaccaa taagttactc ttgtttgctg 120

atatcaatgg taagctttac catgattctc agaacatgct tagaggtgaa gatatgtctt 180 

tccttgagaa ggatgcacct tacatggtag gtgatgttat cactagtggt gatatcactt 240 

gtgttgtaat accctccaaa aaggctggtg gcactactga gatgctctca agagctttga 300 

agaaagtgcc agttgatgag tatataacca cgtaccctgg acaaggatgt gctggttata 360 
cacttgagga agctaagact gctcttaaga aatgcaaatc tgcattttat gtactacctt 420
cagaagcacc taatgct 437 



<210> 100 

<211> 569

<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 100 

cctctatcgt attgacggag ctcaccttac aaagatgtca gagtacaaag gaccagtgac 60

tgatgttttc tacaaggaaa catcttacac tacaaccatc aagcctgtgt cgtataaact 120
cgatggagtt acttacacag agattgaacc aaaattggat gggtattata aaaaggataa 180
tgcttactat acagagcagc ctatagacct tgtaccaact caaccattac caaatgcgag 240 
ttttgataat ttcaaactca catgttctaa cacaaaattt gctgatgatt taaatcaa^t 300 

gacaggcttc acaaagccag cttcacgaga gctatctgtc acattcttcc cagacttgaa 360 

tggcgatgta gtggctattg actatagaca ctattcagcg agtttcaaga aaggtgctaa 420 

attactgcat aagccaattg tttggcacat taaccaggct acaaccaaga caacgttcaa 480 

accaaacact tggtgtttac gttgtctttg gagtacaaag ccagtagata cttcaaattc 540 

atttgaagtt ctggcagtag aagacacat 569

<210> 101 
<211> 187
<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 101

tcagcagata cttcaaattc atttgaagtt ctggcagtag aagacacaca aggaatggac 60 
aatcttgctt gtgaaagtca acaacccacc tctgaagaag tagtggaaaa tcctaccata 120 
cagaaggaag tcatagagcg tgacgtgaaa actaccgaag ttgtaggcaa tgtcatactt 180 
aaaccat 187 



<210> 102
<211> 271 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 102 

aaatgcgacg agtctgcttc taagtctgct tctgtgtact acagtcagct gatgtgccaa 60
cctattctgt tgcttgacca agctcttgta tcagacgttg gagatagtac tgaagtttcc 120 
gttaagatgt ttgatgctta tgtcgacacc ttttcagcaa cttttagtgt tcctatggaa 180 

aaacttaagg cacttgttgc tacagctcac agcgagttag caaagggtgt agctttagat 240

ggtgtccttt ctacattcgt gtcagctgcc c 271 

<210> 103

<211> 363

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 103

catttcatca gcaattcttg gctcatgtgg tttatcatta gtattgtaca aatggcaccc 60
gtttctgcaa tggttaggat gtacatcttc tttgcttctt tctactacat atggaagagc 120 

tatgttcata tcatggatgg ttgcacctct tcgacttgca tgatgtgcta taagcgcaat 180 

cgtgccacac gcgttgagtg tacaactatt gttaatggca tgaagagatc tttctatgtc 240 

tatgqaaatg gaggccgtgg cttctgcaag actcacaatt ggaattgtct caattgtgac 300

acattttgca ctggtagtac attcattagt gatgaagttg ctcgagattt gtcactccag 360 

ttt 363

<210> 104
<211> 500
<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 104

agag^tcttg gcgcatgtat tgactgtaat gcaaggcata tcaatgccca aggtagcaaa 60

aagtcacaat gtttcactca tctggaatgt aaaagactac atgtctttat ctgaacagct 120

gcgtaaacaa attcgtagtg ctgccaagaa gaacaacata ccttttagac taacttgtgc 180
tacaactaga caggttgtca atgtcataac tactaaaatc tcactcaagg gtggtaagat 240

tgttagtact tgttttaaac ttatgcttaa ggccacatta ttgtgcgttc ttgctgcatt 300
ggtttgttat atcgttatgc cagtacatac attgtcaatc catgatggtt acacaaatga 360

aatcattggt tacaaagcca ttcaggatgg tgtcactcgt gacatcattt ctactgatga 420

ttgttttgca aataaacatg ctggttttga cgcatggttt agccagcgtg gtggttcata 480

caaaaatgac aaaagctgcc 500 

<210> 105 
<211> 537 
<212> DNA 

<213> Severe acute respiratory syndrome virus
<400> 105 

cattgtcaat ccatgatggt tacacaaatg aaatcattgg ttacaaagcc attcaggatg 60 
gtgtcactcg tgacatcatt tctactgatg attgttttgc aaataaacat gctggttttg 120 
acgcatggtt tagccagcgt ggtggttcat acaaaaatga caaaagctgc cctgtagtag 180 
ctgctatcat tacaagagag att^gtttca tagtgcctgg cttactgggt actgtgctga 240

gagcaatcaa tggtgacttc ttgcattttc tacctcgtgt ttttagtgct gttggcaaca 300

tttgctacac accttccaaa ctcattgagt atagtgattt tgctacctct gcttgcgttc 360 

ttgctgctga gtgtacaatt tttaaggatg ctatgggcaa acctgtgcca tattgttatg 420
acactaattt gctagagggt tctatttctt ^tagtgagct tcgtccagac actcgttatg 480 
tgcttatgga tggttccatc atacagtttc ctaacactta cctggagggg tctgtta 537 

<210> 106 

<211> 427

<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 106

cacttttgtt tttgatgtct ttcactatac tctgtctggt accagcttac agctttctgc 60
cgggagtcta ctcagtcttt tacttgtact tgacattcta tttcaccaat gatgtttcat 120
tcttggctca ccttcaatgg tttgccatgt tttctcctat tgtgcctttt tggataacag 180
caatctatgt attctgtatt tctctgaagc actgccattg gttctttaac aactatctta 240

ggaaaagagt catgtttaat ggagttacat ttagtacctt cgaggaggct gctttgtgta 300
cctttttgct caacaaggaa atgtacctaa aattgcgtag cgagacactg ttgccactta 360

cacagtataa caggtatctt gctctatata acaagtacaa gtatttcagt ggagccttag 420

atactac 427

<210> 107

<211> 537 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 107 

agtaacaact tttgatgctg agtactgtag acatggtaca tgcgaaaggt cagaagtagg 60
tatttgccta tctaccagtg gtagatgggt tcttaataat gagcattaca gagctctatc 120 
aggagttttc tgtggtgttg atgcgatgaa tctcatagct aacatcttta ctcctcttgt 180 
gcaacctgtg ggtgctttag atgtgtctgc ttcagtagtg gctggtggta ttattgccat 240 
attggtgact tgtgctgcct actactttat gaaattcaga cgtgtttttg gtgagtacaa 300 
ccatgttgtt gctgctaatg cacttttgtt tttgatgtct ttcactatac tctgtctggt 360 
accagcttac agctttctgc cgggagtcta ctcagtcttt tacttgtact tgacattcta 420 
tttcaccaat gatgtttcat tcttggctca ccttcaatgg tttgccatgt tttctcctat 480 
tgtgcctttt tggataacag caatctatgt attctgtatt tctctgaagc actgcca 537 

<210> 108
<211> 551
<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 108

agtatactgt ccaagacatg tcatttgcac agcagaagac atgcttaatc ctaactatga 60

agatctgctc attcgcaaat ccaaccatag ctttcttgtt caggctggca atgttcaact 120

tcgtgttatt ggccattcta tgcaaaattg tctgcttagg cttaaagttg atacttctaa 180

ccctaagaca cccaagtata aatttgtccg tatccaacct ggtcaaacat tttcagttct 240

agcatgctac aatggttcac catctggtgt ttatcagtgt gccatgagac ctaatcatac 300 

cattaaaggt tctttcctta atggatcatg tggtagtgtt ggttttaaca ttgattatga 360 

ttgcgtgtct ttctgctata tgcatcatat ggagcttcca acaggagtac acgctggtac 420 

tgacttagaa ggtaaattct atggtccatt tgttgacaga caaactgcac aggctgcagg 480 

tacagacaca accataacat taaatgtttt ggcatggctg tatgctgctg ttatcaatgg 540
tgataggtgg t 551 



<210> 109

<211> 593

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 109

acttagcaaa ggctctaaat gactttagca actcaggtgc tgatgttctc taccaaccac 60

cacagacatc aatcacttct gctgttctgc agagtggttt taggaaaatg gcattcccgt 120

caggcaaagt tgaagggtgc atggtacaag taacctgtgg aactacaact cttaatggat 180
tgtggttgga tgacacagta tactgtccaa gacatgtcat ttgcacagca gaagacatgc 240

ttaatcctaa ctatgaagat ctgctcattc gcaaatccaa ccatagcttt cttgttcagg 300

ctggcaatgt tcaacttcgt gttattggcc attctatgca aaattgtctg cttaggctta 360

aagttgatac ttctaaccct aagacaccca agtataaatt tgtccgtatc caacctggtc 420

aaacattttc agttctagca tgctacaatg gttcaccatc tggtgtttat cagtgtgcca 480

tgagacctaa tcataccatt aaaggttctt tccttaatgg atcatgtggt agtgttggtt 540 

ttaacattga ttatgattgc gtgtctttct gctatatgca tcatatggag ctt 593 



<210> 110 

<211> 504 

<212> DNA 

<213> Severe acute respiratory syndrome virus

<400> 110 

tgtgctgctt tgaaagagct gctgcagaat gggtatgaat ggtcgtacta tccttggtag 60

cactatttta gaagatgagt ttacaccatt tgatgttgtt agacaatgct ctggtgttac 120

cttccaaggg taagttcaag aaaattgtta agggcactca tcattggatg cttttaactt 180 

tcttgacatc actattgatt cttgttcaaa gtacacagtg gtcactgttt ttctttgttt 240 

acgagaatgc tttcttgcca tttactcttg gtattatggc aattgctgca tgtgctatgc 300 

tgcttgttaa gcataagcac gcattcttgt gcttgtttct gttaccttct cttgcaacag 360 

ttgcttactt taatatggtc tacatgcctg ctagctgggt gatgcgtatc atgacatggc 420 

ttgaattggc tgacactagc ttgtctggtt ataggcttaa ggattgtgtt atgtatgctt 480 

cagctttagt tttgcttatt ctca 504 

<210> 111 
<211> 298 
<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 111

taggcttaag gattgtgtta tgtatgcttc agctttagtt ttgcttattc tcatgacagc 60

tcgcactgtt tatgatgatg ctgctagacg tgtttggaca ctgatgaatg tcattacact 120

tgtttacaaa gtctactatg gtaatgcttt agatcaagct atttccatgt gggccttagt 180

tatttctgta acctctaact attctggtgt cgttacgact atcatgtttt tagctagagc 240

tatagtgttt gtgtgtgttg agtattaccc attgttattt attacctggc aacacctt 298 

<210> 112
<211> 530 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 112

aaacaggcaa gatctgagga caagagggca aaagtaacta gtgctatgca aacaatgctc 60
ttcactatgc ttaggaagct tgataatgat gcacttaaca acattatcaa caatgcgcgt 120
gatggttgtg ttccactcaa catcatacca ttgactacag cagccaaact catggttgtt 180
gtccctgatt atggtaccta caagaacact tgtgatggta acacctttac atatgcatct 240

gcactctggg aaatccagca agttgttgat gcggatagca agattgttca acttagtgaa 300

attaacatgg acaattcacc aaatttggct tggcctctta ttgttacagc tctaagagcc 360

aactcagctg ttaaactaca gaataatgaa ctgagtccag tagcactacg acagatgtcc 420 

tgtgcggctg gtaccacaca aacagcttgt actgatgaca atgcacttgc ctactataac 480 

aattcgaagg gaggtaggtt tgtgctggca ttactatcag accaccaagc 530 

<210> 113
<211> 605
<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 113

gaagtcgttc tcaaaaagtt aaagaaatct ttgaatgtgg ctaaatctga gtttgaccgt 60

gatgctgcca tgcaacgcaa gttggaaaag atggcagatc aggctatgac ccaaatgtac 120

aaacaggcaa gatctgagga caagagggca aaagtaacta gtgctatgca aacaatgctc 180

ttcactatgc ttaggaagct tgataatgat gcacttaaca acattatcaa caatgcgcgt 240

gatggttgtg ttccactcaa catcatacca ttgactacag cagccaaact catggttgtt 300

gtccctgatt atggtaccta caagaacact tgtgatggta acacctttac atatgcatct 360

gcactctggg aaatccagca agttgttgat gcggatagca agattgttca acttagtgaa 420

attaacatgg acaattcacc aaatttggct tggcctctta ttgttacagc tctaagagcc 480

aactcagctg ttaaactaca gaataatgaa ctgagtccag tagcactacg acagatgtcc 540

tgtgcggctg gtaccacaca aacagcttgt actgatgaca atgcacttgc ctactataac 600

aattc 605

<210> 114

<211> 176

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 114

acactggtac aggacaggca attactgtaa caccagaagc taacatggac caagagtcct 60

ttggtggtgc ttcatgttgt ctgtattgta gatgccacat tgaccatcca aatcctaaag 120

gattctgtga cttgaaaggt aagtacgtcc aaatacctac cacttgtgct aatgat 176


<210> 115 
<211> 516 
<212> DNA 

<213> Severe acute respiratory syndrome virus

<400> 115

actgtaacac cagaagctaa catggaccaa gagtcctttg gtggtgcttc atgttgtctg 60

tattgtagat gccacattga ccatccaaat cctaaaggat tctgtgactt gaaaggtaag 120

tacgtccaaa tacctaccac ttgtgctaat gacccagtgg gttttacact tagaaacaca 180

gtctgtaccg tctgcggaat gtggaaaggt tatggctgta gttgtgacca actccgcgaa 240

cccttgatgc agtctgcgga tgcatcaacg tttttaaacg ggtttgcggt gtaagtgcag 300

cccgtcttac accgtgcggc acaggcacta gtactgatgt cgtctacagg gcttttgata 360

tttacaacga aaaagttgct ggttttgcaa agttcctaaa aactaattgc tgtcgcttcc 420

aggagaagga tgaggaaggc aatttattag actcttactt tgtagttaag aggcatacta 480
tgtctaccta ccaacatgaa gagactattt ataact 516

<210> 116 
<211> 366
<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 116

accacttatt aagtgggatt tgctgaaata tgattttacg gaagagagac tttgtctctt 60

cgaccgttat tttaaatatt gggaccagac ataccatccc aattgtatta actgtttgga 120
tgataggtgt atccttcatt gtgcaaactg taatgtgtta ttttctgctg tgtttccacg 180
tacaagtttt ggaccactag taagaaaaat atttgtagat ggtgttcctt ttgttgtttc 240

aactggatac cattttcgtg agttaggagt cgtacataat caggatgtaa acttacatag 300

ctcgcgtctc agtttcaagg aacttttagt gtatgctgct gatccagcta tgcatgcagc 360

ttctgg 366

<210> 117

<211> 291

<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 117

tgaaaaagtt gctggttttg caaagttcct aaaaactaat tgctgtcgct tccaggagaa 60
ggatgaggaa ggcaatttat tagactctta ctttgtagtt aagaggcata ctatgtctaa 120

ctaccaacat gaagagacta tttataactt ggttaaagat tgtccagcgg ttgctgtcca 180
tgactttttc aagtttagag tagatggtga catggtacca catatatcac gtcagcgtct 240
aactaaatac acaatggctg atttagtcta tgctctacgt cattttgatg a 291



<210> 118 
<211> 480 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 118 

gagtcccata tggatgctga tctcgcaaaa ccacttatta agtgggattt gctgaaatat 60 
gattttacgg aagagagact ttgtctcttc gaccgttatt ttaaatattg ggaccagaca 120
taccatccca attgtattaa ctgtttggat gataggtgta tccttcattg tgcaaacttt 180 
aatgtgttat tttctactgt gtttccacct acaagttttg gaccactagt aagaaaaata 240 
tttgtagatg gtgttccttt tgttgtttca actggatacc attttcgtga gttaggagtc 300 
gtacataatc aggatgtaaa cttacatagc tcgcgtctca gtttcaagga acttttagtg 360 
tatgctgctg atccagctat gcatgcagct tctggcaatt tattgct^ga taaacgcact 420
acatgctttt cagtagctgc actaacaaac aatgttgctt ttcaaactgt caaacccggt 480 

<210> 119

<211> 405

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 119

aatgggaact ggtacgattt cggtgatttc gtacaagtag caccaggctg cggagttcct 60

attgtggatt catattactc attgctgatg qfccatcctca ctttgactag ggcattggct 120 

gctgagtccc atatggatgc tgatctcgca aaaccactta ttaagtgaga tttgctgaaa 180 

tatgatttta cggaagagag actttgtctc ttcgaccgtt attttaaata ttgggaccag 240 

acataccatc ccaattgtat taactgtttg gatgataggt gtatccttca ttgtgcaaac 300

tttaatgtgt tattttctac tgtgtttcca cctacaagct ttggaccact agtaagaaaa 360

atatttgtag atggtgttcc ttttgttgtt tcaactggat accat 405 

<210> 120

<211> 562

<212> DNA

<213> Severe acute respiratory syndrome virus

<220>

<221> misc_feature

<222> (67)..(67)

<223> n is a, c, g, or t

<400> 120

ctattgatgc ttacccactt acaaaacatc ctaatcagga gtatgctgat gtctttcact 60

tgtattnaca atacattaga aagttacatg atgagcttac tggccacatg ttggacatgt 120

attccgtaat gctaactaat gataacacct cacggtactg ggaacctgag ttttatgagg 180

ctatgtacac accacataca gtcttgcagg ctgtaggtgc ttgtgtattg tgcaattcac 240

agacttcact tcgttgcggt gcctgtatta ggagaccatt cctatgttgc aagtgctgct 300 

atgaccatgt catttcaaca tcacacaaat tagtgttgtc tgttaatccc tatgtttgca 360 

atgccccagg ttgtgatgtc actgatgtga cacaactgta tctaggaggt atgagctatt 420 

attgcaagtc acataagcct cccattagtt ttccattatg tgctaatggt caggtttttg 480

gtttatacaa aaacacatgt gtaggcagtg acaatgtcac tgacttcaat gcgatagcaa 540 

catgtgattg gactaatgct gg 562 



<210> 121 
<211> 580

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 121
gctatgtaca caccacatac agtcttgcag gctgtaggtg cttgtgtatt gtgcaattca 60

cagacttcac ttcgttgcgg tgcctgtatt aggagaccat tcctatgttg caagtgctgc 120

tatgaccatg tcatttcaac atcacacaaa ttagtgttgt ctgttaatcc ctatgtttgc 180

aatgccccag gttgtgatgt cactgatgtg acacaactgt atctaggagg tatgagctat 240 

tattgcaagt cacataagcc tcccattagt tttccattat gtgctaatgg tcaggttttt 300
ggtttataca aaaacacatg tgtaggcagt gacaatgtca ctgacttcaa tgcgatagca 360

acatgtgatt ggactaatgc tggcgattac atacttgcca acacttgtac tgagagactc 420

aagcttttcg cagcagaaac gctcaaagcc actgaggaaa catttaagct gtcatatggt 480

attgccactg tacgcgaagt actctctgac agagaattgc atctttcatg ggaggttgga 540

aaacctsgac cciccattgaa cagaaactat gtctttactg 580

• 

<210> 122
<211> 610

<212> DNA
<213> Severe acute respiratory syndrome virus

<400> 122

tggtgatgct gttgtgtaca gaggtactac gacatacaag ttgaatgttg gtgattactt 60

tgtgttgaca tctcacactg taatgccact tagtgcacct actctagtgc cacaagagca 120

ctatgtgaga attactggct tgtacccaac actcaacatc tcagatgagt tttctagcaa 180

tgttgcaaat tatcaaaagg tcggcatgca aaagtactct acactccaag gaccacctgg 240

tactggtaag agtcattttg ccatcggact tgctctctat tacccatctg ctcgcatagt 300

gtatacggca tgctctcatg cagctgttga tgccctatgt gaaaaggcat taaaatattt 360

gcccatagat aaatgtagta gaatcatacc tgcgcgtgcg cgcgtagagt gttttgataa 420

attcaaagtg aattcaacac tagaacagta tgttttctgc actgtaaatg cattgccaga 480

aacaactgct gacattgtag tctttgatga aatctctatg gctactaatt atgacttgag 540

* 

tgttgtcaat gctagacttc gtgcaaaaca ctacgtctat attggcgatc ctgctcaatt 600 

• « 

accagcccct 610 



<210> 123 
<211> 429
<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 123

ccaacactca acatctcaga tgagttttct agcaatgttg caaattatca aaaggtcggc 60

atgcaaaagt actctacact ccaaggacca cctggtactg gtaagagtca ttttgccatc 120 

ggacttgctc tctattaccc atctgctcgc atagtgtata cggcatgctc tcatgcagct 180

gttgatgccc tatgtgaaaa ggcattaaaa tatttgccca tagataaatg tagtagaatc 240
atacctgcgc gtgcgcgcgt agagtgtttt gataaattca aagtgaattc aacactagaa 300
cagtatgttt tctgcactgt aaatgcattg ccagaaacaa ctgctgacat tgtagtcttt 360 

gatgaaatct ctatggctac taattatgac ttgagtgttg tcaatgctag acttcgtgca 420 
aaacactac 429



<210> 124

<211> 486

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 124

caatgtggct atcacaaggg caaaaattgg cattttgtgc ataatgtctg atagagatct 60
ttatgacaaa ctgcaattta caagtctaga aataccacgt cgcaatgtgg ctacattaca 120
agcagaaaat gtaactggac tttttaagga ctgtagtaag atcattactg gtcttcatcc 180

tacacaggca cctacacacc tcagcgttga tataaagttq aagactgaag gattatgtgt 240

tgacatacca ggcataccaa aggacatgac ctaccgtaga ctcatctcta tgatgggttt 300

caaaatgaat taccaagtca atggttaccc taatatgttt atcacccgcg aagaagctat 360

tcgtcacgtt cgtgcgtgga ttggctttga tgtagagggc tgtcatgcaa ctagagatgc 420

tgtgggtact aacctacctc tccagctagg attttctaca ggtgttaact tagtagctgt 480
accgac 486 

<210> 125

<211> 427

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 125

aaaggacatg acctaccgta gactcatctc tatgatgggt ttcaaaatga attaccaagt 60 
caatggttac cctaatatgt ttatcacccg cgaagaagct attcgtcacg ttcgtgcgtg 120

gattggcttt gatgtagagg gctgtcatgc aactagagat gctgtgggta ctaacctacc 180 

tctccagcta ggattttcta caggtgttaa cttagtagct gtaccgactg gttatgttga 240 

cactgaaaat aacacagaat tcaccagagt taatgcaaaa cctccaccag gtgaccagtt 300 

taaacatctt ataccactca tgtataaaqg cttgccctgg aatgtagtgc gtattaagat 360 

agtacaaatg ctcagtgata cactgaaagg attgtcagac agagtcgtgt tcgtcctttg 420
ggcgcat 427



<210> 126
<211> 392
<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 126

atggaaatgc acatgtggct agttgtgatg ctatcatgac tagatgttta gcagtccatg 60
agtgctttgt taagcgcgtt gattggtctg ttgaataccc tattatagga gatgaactga 120

gggttaattc tgcttgcaga aaagtacaac acatggttgt gaagtctgca ttgcttgctg 180

ataagtttcc agttcttcat gacattggaa atccaaaggc tatcaagtgt gtgcctcagg 240

ctgaagtaga atggaagttc tacgatgctc agccatgtag tgacaaagct tacaaaatag 300

aggaactctt ctattcttat gctacacatc acgataaatt cactgatggt gtttgtttgt 360

tttggaattg taaigttg'at cgttacccag cc 392

<210> 127

<211> 483

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 127

gcttpatcaq atacttat^c ct^c^^g^a-t cattctgtgg gttttgacta tgtctataac 60

ccatttatga ttgatgtt<?a gcagtggggc tttacgggta accttcagag taaccatgac 120 

caacattgcc aggtacatgg aaatgcacat gtggctagtt gtgatgctat catgactaga 180 

tgtttagcag tccatgagtg ctttgttaag cgcgttgatt ggtctgttga ataccctatt 240

ataggagatg aactgagggt taattctgct tgcagaaaag tacaacacat ggttgtgaag 300 

tctgcattgc ttgctgataa gtttccagtt cttcatgaca ttggaaatcc aaaggctatc 360 

aagtgtgtgc ctcaggctga agtagaatgg aagttctacg atgctcagcc atgtagtgac 420 

aaagcttaca aaatagagga actcttctat tcttatgcta cacatcacga taaattcact 480
gat 483



<210> 128
<211> 326
<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 128 

tcaaagggac cagcacaagc tagcgtcaat ggagtcacat taattggaga atcagtaaaa 60 
acacagttta actactttaa gaaagtagac ggcattattc aacagttgcc tgaaacctac 120 
tttactcaga gcagagactt agaggatttt aagcccagat cacaaatgga aactgacttt 180
ctcgagctcg ctatggatga attcatacag cgatataagc tcgagggcta tgccttcgaa 240

cacatcgttt atggagattt cagtcatgga caacttggcg gtcttcattt aatgataggc 300

ttagccaagc gctcacaaga ttcact 326 

<210> 129

<211> 457

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 129 

acaccttcaa agggaccagc acaagctagc gtcaatggag tcacattaat tggagaatca 60

gtaaaaacac agtttaacta ctttaagaaa gtagacggca ttattcaaca gttgcctgaa 120 

acctacttta ctcagagcag agacttagag gattttaagc ccagatcaca aatggaaact 180

gactttctcg agctcgctat ggatgaattc atacagcgat ataagctcga gggctatgcc 240

ttcgaacaca tcgtttatgg agatttcagt catggacaac ttggcggtct tcatttaatg 300

ataggcttag ccaagcgctc acaagattca ccacttaaat tagagg^ttt tatccctatg 360 

gacagcacag tgaaaaatta cttcataaca gatgcgcaaa caggtt^atc aaaatgtgtg 420 

tg^tctgtga ttgatctttt acttgatgac tttgtcg 457



<210> 130

<211> 493

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 130

cgcaaagtat actcaactgt gtcaatactt aaatacactt actttagctg taccctacaa 60 

catgagagtt attcactttg gtgctggctc tgataaagga gttgcaccag gtacagctgt 120 

gctcagacaa tggttgccaa ctggcacact acttgtcgat tcagatctta atgacttcgt 180

ctccgacgca gattctactt taattggaga ctgtgcaaca gtacatacgg ctaataaatg 240

ggaccttatt attagcgata tgtatgaccc taggaccaaa catgtgacaa aagagaatga 300

ctctaaagaa gggtttttca cttatctgtg tggatttata aagcaaaaac tagccctggg 360

tggttctata gctgtaaaga taacagagca ttcttggaat gctgaccttt acaagcttat 420

gggccatttc tcatggtgga cagcttttgt tacaaatgta aatgcatcat catcggaagc 480

atttttaatt ggg 493 



<210> 131 

<211> 490 

<212> DNA 

<213> Severe acute respiratory syndrome virus

<400> 131

acttaaatac acttacttta gctgtaccct acaacatgag agttattcac tttggtgctg 60 

gctctgataa aggagttgca ccaggtacag ctgtgctcag acaatggttg ccaactggca 120

cactacttgt cgattcagat cttaatgact tcgtctccga cgcagattct actttaattg 180

gagactgtgc aacagtacat acggctaata aatgggacct tattattagc gatatgtatg 240

accctaggac caaacatgtg acaaaagaga atgactctaa agaagggttt ttcacttatc 300
tgtgtggatt tataaagcaa aaactagccc tgggtggttc tatagctgta aagataacag 360

agcattcttg gaatgctgac ctttacaagc ttatgggcca tttctcatgg tggacagctt 420

ttgttacaaa tgtaaatgca tcatcatcgg aagcattttt aattggggct aactatcttg 480

gcaagccgaa 490

<210> 132

<211> 550

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 132

taaggagaat caaatcaatg atatgattta ttctcttctg gaaaaaggta ggcttatcat 60 

tagagaaaac aacagagttg tggtttcaag tgatattctt gttaacaact aaacgaacat 120

gtttattttc ttattatttc ttactctcac tagtggtagt gaccttgacc ggtgcaccac 180 

ttttgatgat gttcaagctc ctaattacac tcaacatact tcatctatga ggggggttta 240 

ctatcctgat gaaattttta gatcagacac tctttattta actcaggatt tatttcttcc 300 

attttattct aatgttacag ggtttcatac tattaatcat acgtttggca accctgtcat 360 

accttttaag gatggtattt attttgctgc cacagagaaa tcaaatgttg tccgtggttg 420 

ggtttttggt tctaccatga acaacaagtc acagtcggtg attattatta acaattctac 480 

taatgttgtt atacgagcat gtaactttga attgtgtgac aaccctttct ttgctgtttc 540 

taaacccata 550 

<210> 133

<211> 490

<212> DNA

<213> Severe acute respiratory syndrome virus 
<400> 133 

acttaaatac acttacttta gctgtaccct acaacatgag agttattcac tttggtgctg 60 
gctctgataa aggagttgca ccaggtacag ctgtgctcag acaatggttg ccaactggca 120 
cactacttgt cgattcagat cttaatgact tcgtctccga cgcagattct actttaattg 180 
gagactgtgc aacagtacat acggctaata aatgggacct tattattagc gatatgtatg 240


wo 2004/096842 PCT/CA2004/000626 

accctaggac caaacatgtg acaaaagaga atgactctaa agaagggttt ttcacttatc 300

tgtgtggatt tataaagcaa aaactagccc tgggtggttc tatagctgta aagataacag 360

agcattcttg gaatgctgac ctttacaagc ttatgggcca tttctcatgg tggacagctt 420 

ttgttacaaa tgtaaatgca tcatcatcgg aagcattttt aattggggct aactatcttg 480 
gcaagccgaa 490



<210> 134 

<211> 550 

<212> DNA 

<213> Severe acute respiratory syndrome virus

<400> 134 

taaggagaat caaatcaatg atatgattta ttctcttctg gaaaaaggta ggcttatcat 60 

tagagaaaac aacagagttg tggtttcaag tgatattctt gttaacaact aaacgaacat 120 

gtttattttc ttattatttc ttactctcac tagtggtagt gaccttgacc ggtgcaccac 180 

ttttgatgat gttcaagctc ctaattacac tcaacatact tcatctatga ggggggttta 240

ctatcctgat gaaattttta gatcagacac tctttattta actcaggatt tatttcttcc 300

attttattct aatgttacag ggtttcatac tattaatcat acgtttggca accctgtcat 360

accttttaag gatggtattt attttgctgc cacagagaaa tcaaatgttg tccgtggttg 420

ggtttttggt tctaccatga acaacaagtc acagtcggtg attattatta acaattctac 480

taatgttgtt atacgagcat gtaactttga attgtgtgac aaccctttct ttgctgtttc 540

taaacccata 550 



<210> 135 
<211> 400
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 135 

atcaatgata tgatttattc tcttctggaa aaaggtaggc ttatcattag agaaaacaac 60 
agagttgtgg tttcaagtga tattcttgtt aacaactaaa cgaacatgtt tattttctta 120 
ttatttctta ctctcactag tggtagtgac cttgaccggt gcaccacttt tgatgatgtt 180
caagctccta attacactca acatacttca tctatgaggg gggtttacta tcctgatgaa 240 
atttttagat cagacactct ttatttaact caggatttat ttcttccatt ttattctaat 300 
gttacagggt ttcatactat taatcatacg tttggcaacc ctgtcatacc ttttaaggat 360 
ggtatttatt ttgctgccac agagaaatca aatgttgtcc 400 



<210> 136 
<211> 288
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 136 

tgatctttgc ttctccaatg tctatgcaga ttctttggta gtcaagggag atgatgtaag 60 

acaaatagcg ccaggacaaa ctggtgttat tgctgattat aattataaat tgccagatga 120 

tttcatgggt tgtgtccttg cttggaatac taggaacatt gatgctactt caactggtaa 180

ttataattat aaatataggt atcttagaca tggcaagctt aggccctttg agagagacat 240

atctaatgtg cctttctcca cctgatggca aaccttgcac cccacctg 288



<210> 137 
<211> 411 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 137 

ctttgagaga gacatatcta atgtgccttt ctcccctgat ggcaaacctt gcaccccacc 60 
tgctcttaat tgttattggc cattaaatga ttatggtttt tacaccacta ctggcattgg 120 
ctaccaacct tacagagttg tagtactttc ttttgaactt ttaaatgcac cggccacggt 180 

ttgtggacca aaattatcca ctgaccttat taagaaccag tgtgtcaatt ttaattttaa 240 

tggactcact ggtactggtg tgttaactcc ttcttcaaag agatttcaac catttcaaca 300 

aattttgccg tgatgt-ttcl gatttcactg attccgttcg agatcctaaa acatctgaaa 360 

tattagacat ttcaccctgc gcttttgggg gtgtaagtgt aattacacct g 411



<210> 138
<211> 357

<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 138

tggaaatatt ttggtggttt taatttttca caaatattac ctgaccctct aaagccaact 60 
aagaggtctt ttattgagga cttgctcttt aataaggtga cactcgctga tgctggcttc 120 
atgaagcaat atggcgaatg cctaggtgat attaatgcta gagatctcat ttgtgcgcag 180 
aagttcaatg gacttacagt gttgccacct ctgctcactg atgatatgat tgctgcctac 240 
actgctgctc tagttagtgg tactgccact gctggatgga catttggtgc tggcgctgct 300 
cttcaaatac cttttgctat gcaaatggca tataggttca atggcattgg agttact 357 



<210> 139 

<211> 434 

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 139

caatatggcg aatgcctagg tgatattaat gctagagatc tcatttgtgc gcagaagttc 60

aatggactta cagtgttgcc acctctgctc actgatgata tgattgctgc ctacactgct 120

gctctagtta gtggtactgc cactgctgga tggacatttg gtgctggcgc tgctcttcaa 180

ataccttttg ctatgcaaat ggcatatagg ttcaatggca ttggagttac ccaaaatgtt 240

ctctatgaga accaaaaaca aatcgccaac caatttaaca aggcgattag 300

gaatcactta caacaacatc aactgcattg ggcaagctgc aagacgttgt taaccagaat 360

gctcaagcat taaacacact tgttaaacaa cttagctcta attttggtgc aatttcaagt 420

gtgctaaatg atat 434


<210> 140
<211> 557
<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 140

acagacaata catttgtctc aggaaattgt gatgtcgtta ttggcatcat taacaacaca 60

gtttatgatc ctctgcaacc tgagcttgac tcattcaaag aagagctgga caagtacttc 120

aaaaatcata catcaccaga tgttgatctt ggcgacattt caggcattaa cgcttctgtc 180

gtcaacattc aaaaagaaat tgaccgcctc aatgaggtcg ctaaaaattt aaatgaatca 240

ctcattgacc ttcaagaatt gggaaaatat gagcaatata ttaaatggcc ttggtatgtt 300

tggctcggct tcattgctgg actaattgcc atcgtcatgg ttacaatctt gctttgttgc 360

atgactagtt gttgcagttg cctcaagggt gcatgctctt gtggttcttg ctgcaagttt 420

gatgaggatg actctgagcc agttctcaag ggtgtcaaat tacattacac ataaacgaac 480

ttatggattt gtttatgaga ttttttactc ttagatcaat tactgcacag ccagtaaaaa 540

ttgacaatgc ttctcct 557


<210> 141 
<211> 530 
<212> DNA 

<213> Severe acute respiratory syndrome virus

<400> 141

atgtttggct cggcttcatt gctggactaa ttgccatcgt catggttaca atcttgcttt 60

gttgcatgac tagttgttgc agttgcctca agggtgcatg ctcttgtggt tcttgctgca 120

agtttgatga ggatgactct gagccagttc tcaagggtgt caaattacat tacacataaa 180

cgaacttatg gatttgttta tgagattttt tactcttaga tcaattactg cacagccagt 240

aaaaattgac aatgcttctc ctgcaagtac tgttcatgct acagcaacga taccgctaca 300

agcctcactc cctttcggat ggcttgttat tggcgttgca tttcttgctg tttttcagag 360
cgctaccaaa ataattgcgc tcaataaaag atggcagcta gccctttata agggcttcca 420 
gttcatttgc aatttactgc tgctatttgt taccatctat tcacatcttt tgcttgtcgc 480 
tgcaggtatg gaggcgcaat ttttgtacct ctatgccttg atatattttc 530 



<210> 142 
<211> 320 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 142 

ttgctcgtac ccgctcaatg tggtcattca acccagaaac aaacattctt ctcaatgtgc 60
ctctccgggg gacaattgtg accagaccgc tcatggaaag tgaacttgtc attggtgctg 120 
tgatcattcg tggtcacttg cgaatggccg gacactccct agggcgctgt gacattaagg 180 

acctgccaaa agagatcact gtggctacat cacgaacgct ttcttattac aaattaggag 240

cgtcgcagcg tgtaggcact gattcaggtt ttgctgcata caaccgctac cgtattggaa 300 

actataaatt aaatacagac 320 



<210> 143 
<211> 417 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 143 

cgaacttatg tactcattcg tttcggaaga aacaggtacg ttaatagtta atagcgtact 60 
tctttttctt gctttcgtgg tattcttgct agtcacacta gccatcctta ctgcgcttcg 120 
attgtgtgcg tactgctgca atattgttaa cgtgagttta gtaaaaccaa cggtttacgt 180 
ctactcgcgt gttaaaaatc tgaactcttc tgaaggagtt cctgatcttc tggtctaaac 240 
gaactaacta ttattattat tctgtttgga actttaacat tgcttatcat ggcagacaac 300 
ggtactatta ccgttgagga gcttaaacaa ctcctggaac aatggaacct agtaataggt 360
ttcctattcc tagcctggat tatgttacta caatttgcct attctaatcg gaacagg 417 



<210> 144 
<211> 516 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 144 

cttgtcattg gtgctgtgat cattcgtggt cacttgcgaa tggccggaca ctccctaggg 60 
cgctgtgaca ttaaggacct gccaaaagag atcactgtgg ctacatcacg aacgctttct 120 
tattacaaat taggagcgtc gcagcgtgta ggcactgatt caggttttgc tgcatacaac 180 
cgctaccgta ttggaaacta taaattaaat acagaccacg ccggtagcaa cgacaatatt 240
gctttgctag tacagtaagt gacaacagat gtttcatctt gttgacttcc aggttacaat 300

agcagagata ttgattatca ttatgaggac tttcaggatt gctatttgga atcttgacgt 360

tataataagt tcaatagtga gacaattatt taagcctcta actaagaaga attattcgga 420 

gttagatgat gaagaaccta tggagttaga ttatccataa aacgaacatg aaaattattc 480 

tcttcctgac attgatttta tttacatctt gcgagc 516

<210> 145
<211> 310
<212> DNA

<213> Severe acute respiratory syndrome virus 
<400> 145 

cgatgtttca tcttgttgac ttccaggtta caatagcaga gatattgatt atcattatga 60 

ggactttcag gattgctatt tggaatcttg acgttataat aagttcaata gtgagacaat 120

tatttaagcc tctaactaag aagaattatt cggagttaga tgatgaagaa cctatggagt 180 

tagattatcc ataaaacgaa catgaaaatt attctcttcc tgacattgat tgtatttaca 240 

tcttgcgagc tatatcacta tcaggagtgt gttagaggta cgactgtact actaaaagaa 300 

ccttgcccat 310 

<210> 146

<211> 556

<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 146 

agaaagacag aatgaatgag ctcactttaa ttgacttcta tttgtgcttt ttagcctttc 60

tgctattcct tgttttaata atgcttatta tattttggtt ttcactcgaa atccaggatc 120 

tagaagaacc ttgtaccaaa gtctaaacga acatgaaact tctcattgtt ttgacttgta 180 

tttctctatg cagttgcata tgcactgtag tacagcgctg tgcatctaat aaacctcatg 240 

tgcttgaaga tccttgtaag gtacaacact aggggtaata cttatagcac tgcttggctt 300 

tgtgctctag gaaaggtttt accttttcat agatggcaca ctatggttca aacatgcaca 360 

cctaatgtta ctatcaactg tcaagatcca gctggtggtg cgcttatagc taggtgttgg 420 

taccttcatg aaggtcacca aactgctgca tttagagacg tacttgttgt tttaaataaa 480

cgaacaaatt aaaatgtctg ataatggacc ccaatcaaac caacgtagtg ccccccgcat 540 

tacatttggt ggaccc 556 



<210> 147 
<211> 110 
<212> DNA
<213> Severe acute respiratory syndrome virus
<400> 147

acgaacatga aaattattct cttcctgaca ttgattgtat ttacatcttg cgagctatat 60
cactatcagg agtgtgttag aggtacgact gtactactaa aagaaccttg 110 



<210> 148 
<211> 363
<212> DNA

<213> Severe acute respiratory syndrome virus 
<400> 148 

gcatttagag acgtacttgt tgttttaaat aaacgaacaa attaaaatgt ctgataatgg 60
acctcaatca agccaacgta gtgccccccg cattacattt ggtggaccca cagattcaac 120 
tgacaataac cagaatggag gacgcaatgg ggcaaggcca aaacagcgcc gaccccaagg 180 
tttacccaat aatactgcgt cttggttcac agctctcact cagcatggca aggaggaact 240 
tagattccct cgaggccagg gcgttccaat caacaccaat agtggtccag atgaccaaat 300 
tggctactac cgaagagcta cccgacgagt tcgtggtggt gacggcaaaa tgaaagagct 360 
cag 363 



<210> 149 
<211> 294 
<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 149

ctatcagctg cgtgcaagat cagtttcacc aaaacttttc atcagacaag aggaggttca 60
acaagagctc tactcgccac tttttctcat tgttgctgct ctagtatttt taatactttg 120 
cttcaccatt aagagaaaga cagaatgaat gagctcactt taattgactt ctatttgtgc 180 
tttttagcct ttctgctatt ccttgtttta ataatgctta ttatattttg gttttcactc 240 
gaaatccagg atctagaaaa accttgtacc aaaggctaaa cgaacatgaa actt 294 



<210> 150 
<211> 504 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 150 

caaactgctg catttagaga cgtacttgtt gtttaaataa acgaacaaat taaaatgtct 60 
gataatggac cccaatcaaa ccaacgtagt gccccccgca ttacatttgg tggacccaca 120 
gattcaactg acaataacca gaatggagga cgcaatgggg caaggccaaa acagcgccga 180 
ccccaaggtt tacccaataa tactgcgtct tggttcacag ctctcactca gcatggcaag 240 
gaggaactta gattccctcg aggccagggc gttccaatca acaccaatag tggtccagat 300
gaccaaattg gctactaccg aagagctacc cgacgagttc gtggtggtga cggcaaaatg 360

aaagagctca gccccagatg gtacttctat tacctaggaa ctggcccaga agcttcactt 420

ccctacggcg ctaacaaaga aggcatcgta tgggttgcaa ctgagggagc cttgaataca 480 

cccaaagacc acattggcac ccgt 504 

<210> 151
<211> 474
<212> DNA

<213> Severe acute respiratory syndrome virus 
<400> 151 

ctcgccactt tttctcattg ttgctgctct agtattttta atactttgct tcaccattaa 60 

gagaaagaca gaatgaatga gctcacttta attgacttct atttgtgctt tttagccttt 120 
ctgctattcc ttgttttaat aatgcttatt atattttggt tttcactcga aatccaggat 180

ctagaagaac cttgtaccaa agtctaaacg aacatgaaac ttctcattgt tttgacttgt 240 

atttctctat gcagttgcat atgcactgta gtacagcgct gtgcatctaa taaacctcat 300 

gtgcttgaag atccttgtaa ggtacaacac taggggtaat acttatagca ctgcttggct 360

ttgtgctcta ggaaaggttt taccttttca tagatggcac actatggttc aaacatgcac 420

acctaatgtt actatcaact gtcaagatcc agctggtggt gcgcttatag ctag 474 



<210> 152 
<211> 516 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 152 

cattaagaga aagacagaat gaatgagctc actttaattg acttctattt gtgcttttta 60 
gcctttctgc tattccttgt tttaataatg cttattatat tttggttttc actcgaaatc 120 
caggatctag aagaaccttg taccaaagtc taaacgaaca tgaaacttct cattgttttg 180 
acttgtattt ctctatgcag ttgcatatgc actgtagtac agcgctgtgc atctaataaa 240 
cctcatgtgc ttgaagatcc ttgtaaggta caacactagg ggtaatactt atagcactgc 300 
ttggctttgt gctctaggaa aggttttacc ttttcataga tggcacacta tggttcaaac 360 
atgcacacct aatgttacta tcaactgtca agatccagct ggtggtgcgc ttatagctag 420 
gtgttggtac cttcatgaag gtcaccaaac tgctgcattt agagacgtac ttgttgtttt 480 
aaataaacga acaaattaaa atgtctgata atggac 516



<210> 153 
<211> 451 
<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 153

ccaaggttta cccaataata ctgcgtcttg gttcacagct ctcactcagc atggcaagga 60

ggaacttaga ttccctcgag gccagggcgt tccaatcaac accaatagtg gtccagatga 120

ccaaattggc tactaccgaa gagctacccg acgagttcgt ggtggtgacg gcaaaatgaa 180

agagctcagc cccagatggt acttctatta cctaggaact ggcccagaag cttcacttcc 240

ctacggcgct aacaaagaag gcatcgtatg ggttgcaact gagggagcct tgaatacacc 300

caaagaccac attgacaccc gcaatcctaa taacaatgct gccaccgtac tacaacttcc 360

tcaaggaaca acattgccaa aaggcttcta cgcagaggga agcagaggcg gcagtcaagc 420

ctcttctcgc tcctcatcac gtagtcgcgg t 451


<210> 154 
<211> 495

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 154

gatgaagctc agcctttgcc gcagagacaa aagaagcagc ccactgtgac tcttcttcct 60

gcggctgaca tggatgattt ctccagacaa cttcaaaatt ccatgagtgg agcttctgct 120

gattcaactc aggcataaac actcatgatg accacacaag gcagatgggc tatgtaaacg 180

ttttcgcaat tccgtttacg atacatagtc tactcttgtg cagaatgaat tctcgtaact 240

aaacagcaca agtaggttta gttaacttta atctcacata gcaatcttta atcaatgtgt 300

aacattaggg aggacttgaa agagccacca cattttcatc gaggccacgc ggagtacgat 360

cgagggtaca gtgaataatg ctagggagag ctgcctatat ggaagagccc taatgtgtaa 420

aattaatttt agtagtgcta tccccatgtg attttaatag cttcttagga gaatgacaaa 480

aaaaaaaaaa aaaaa 495



<210> 155 
<211> 512 
<212> DNA 

<213> Severe acute respiratory syndrome virus
<400> 155

acaaggccaa actgtcacta agaaatctgc tgctgaggca tctaaaaagc ctcgccaaaa 60
acgtactgcc acaaaacagt acaacgtcac tcaagcattt gggagacgtg gtccagaaca 120 
aacccaagga aatttcgggg accaagacct aatcagacaa ggaactgatt acaaacattg 180 
gccgcaaatt gcacaatttg ctccaagtgc ctctgcattc tttggaatgt cacgcattgg 240 
catggaagtc acaccttcgg gaacatggct gacttatcat ggagccatta aattggatga 300
caaagatcca caattcaaag acaacgtcat actgctgaac aagcacattg acgcatacaa 360

aacattccca ccaacagagc ctaaaaagga caaaaagaaa aagactgatg aagctcagcc 420 

tttgccgcag agacaaaaga agcagcccac tgtgactctt cttcctgcgg ctgatatgga 480 

tgatttctcc agacaacttc aaaattccat ga 512



<210> 156
<211> 442 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 156 

tgtgactctt cttcctgcgg ctgatatgga tgtttctcca gacaacttca aaattccatg 60 
agtggagctt ctgctgattc aactcaggca taaacactca tgatgaccac acaaggcaga 120 
tgggctatgt aaacgttttc gcaattccgt ttacgataca tagtctactc ttgtgcagaa 180 
tgaattctcg taactaaaca gcacaagtag gtttagttaa ctttaatctc acatagcaat 240 
ctttaatcaa tgtgtaacat tagggaggac ttgaaagagc caccacattt tcatcgaggc 300 
cacgcggagt acgatcgagg gtacagtgaa taatgctagg gagagctgcc tatatggaag 360 
agccctaatg tgtaaaatta attttagtag tgctatcccc atgtgatttt aatagcttct 420
taggagaatg acaaaaaaaa aa 442



<210> 157 

<211> 24

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 157

atgaattacc aagtcaatgg ttac 24 

<210> 158 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 158 

gaagctattc gtcacgttcg 20 



<210> 159 

<211> 22

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 159

ctgtagaaaa tcctagctgg ag 22



<210> 160

<211> 21

<212> DNA

<213> Artificial Sequence

<220>

<223> Primer
<400> 160

cataaccagt cggtacagct a 21



<210> 161 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 161

ttatcacccg cgaagaagct 20



<210> 162 

<211> 22

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<400> 162

ctctagttgc atgacagccc tc 22



<210> 163 
<211> 24 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Primer 
<400> 163 

tcgtgcgtgg attggctttg atgt 24 



<210> 164 

<211> 24

<212> DNA

<213> Artificial Sequence
<220>
<223> Primer

<400> 164

gggttgggac tatcctaagt gtga 24



<210> 165

<211> 22

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<400> 165

taacacacaa acaccatcat ca 22



<210> 166 

<211> 23 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 166 

ggttgggact atcctaagtg tga 23



<210> 167 

<211> 24
<212> DNA

<213> Artificial Sequence

<220>

<223> Primer

<400> 167

ccatcatcag atagaatcat cata 24

<210> 168

<211> 21

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<400> 168

cctctcttgt tcttgctcgc a 21



<210> 169

<211> 21

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<400> 169

tatagtgagc cgccacacat g 21



<210> 170

<211> 21

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<220>

<221> misc_feature

<222> (12)..(12)

<223> n is a, c, g, or t

<400> 170

taacacacaa cnccatcatc a 21



<210> 171

<211> 21

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<400> 171

ctaacatgct taggataatg g 21



<210> 172 

<211> 21 

<212> DNA 

<213> Artificial Sequence

<220>

<223> Primer

<400> 172

gcctctcttg ttcttgctcg c 21



<210> 173 

<211> 21 

<212> DNA 

<213> Artificial Sequence

<220>

<223> Primer
<400> 173

caggtaagcg taaaactcat c 21



<210> 174 
<211> 17
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 174

tacacacctc agcgttg 17



<210> 175 

<211> 16 

<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Primer 

<400> 175 

cacgaacgtg acgaat 16 

<210> 176 

<211> 20

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<400> 176

gccggagctc tgcagaattc 20



<210> 177

<211> 47

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<400> 177

caggaaacag ctatgacttg catcaccact agttgtgcca ccaggtt 47



<210> 178

<211> 46

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<400> 178

tgtaaaacga cggccagttg atgggatggg actatcctaa gtgtga 46



<210> 179 
<211> 20 
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer 

<400> 179 

gcataggcag tagttgcatc 20 

<210> 180

<211> 8

<212> PRT

<213> Artificial Sequence



<220>

<223> ATP Binding Domain



<220>

<221> MISC_FEATURE

<222> (1)..(1)

<223> Xaa = A or G

<220>

<221> misc_feature

<222> (2)..(5)

<223> Xaa can be any naturally occurring amino acid
<220>

<221> MISC_FEATURE

<222> (8)..(8)

<223> Xaa = S or T

<400> 180

Xaa Xaa Xaa Xaa Xaa Gly Lys Xaa
1 5



<210> 181
<211> 23
<212> PRT

<213> Severe acute respiratory syndrome virus
<400> 181

Trp Tyr Val Trp Leu Gly Phe Ile Ala Gly Leu Ile Ala Ile Val Met
1 5 10 15



Val Thr Ile Leu Leu Cys Cys

20 



<210> 182 

<211> 16 

<212> PRT 

<213> Severe acute respiratory syndrome virus 

<400> 182

Met Asp Leu Phe Met Arg Phe Phe Thr Leu Arg Ser Ile Thr Ala Gln
1 5 10 15

<210> 183 

<211> 150 

<212> PRT 

<213> Severe acute respiratory syndrome virus 



<400> 183

Met Arg Cys Trp Leu Cys Trp Lys Cys Lys Ser Lys Asn Pro Leu Leu
1 5 10 15



Tyr Asp Ala Asn Tyr Phe Val Cys Trp His Thr His Asn Tyr Asp Tyr
20 25 30



Cys Ile Pro Tyr Asn Ser Val Thr Asp Thr Ile Val Val Thr Glu Gly
35 40 45



Asp Gly Ile Ser Thr Pro Lys Leu Lys Glu Asp Tyr Gln Ile Gly Gly
50 55 60



Tyr Ser Glu Asp Arg His Ser Gly Val Lys Asp Tyr Val Val Val His
65 70 75 80



Gly Tyr Phe Thr Glu Val Tyr Tyr Gln Leu Glu Ser Thr Gln Ile Thr
85 90 95



Thr Asp Thr Gly Ile Glu Asn Ala Thr Phe Phe Ile Phe Asn Lys Leu
100 105 110



Val Lys Asp Pro Pro Asn Val Gln Ile His Thr Ile Asp Gly Ser Ser
115 120 125



Gly Val Ala Asn Pro Ala Met Asp Pro Ile Tyr Asp Glu Pro Thr Thr
130 135 140



Thr Thr Ser Val Pro Leu
145 150



<210> 184 
<211> 20 
<212> PRT 

<213> Severe acute respiratory syndrome virus 
<400> 184 

Met Met Pro Thr Thr Leu Phe Ala Gly Thr His Ile Thr Met Thr Thr
1 5 10 15



Val Tyr His Ile

20



<210> 185

<211> 42

<212> PRT

<213> Severe acute respiratory syndrome virus
<400> 185

Thr Ala Leu Arg Leu Cys Ala Tyr Cys Cys Asn Ile Val Asn Val Ser
1 5 10 15



Leu Val Lys Pro Thr Val Tyr Val Tyr Ser Arg Val Lys Asn Leu Asn

20 25 30



Ser Ser Glu Gly Val Pro Asp Leu Leu Val
35 40



<210> 186

<211> 39

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 186

Met Ala Asp Asn Gly Thr Ile Thr Val Glu Glu Leu Lys Gln Leu Leu
1 5 10 15



Glu Gln Trp Asn Leu Val Ile Gly Phe Leu Phe Leu Ala Trp Ile Met

20 25 30



Leu Leu Gln Phe Ala Tyr Ser
35



<210> 187

<211> 100

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 187

Pro Leu Arg Gly Thr Ile Val Thr Arg Pro Leu Met Glu Ser Glu Leu
1 5 10 15



Val Ile Gly Ala Val Ile Ile Arg Gly His Leu Arg Met Ala Gly His

20 25 30



Ser Leu Gly Arg Cys Asp Ile Lys Asp Leu Pro Lys Glu Ile Thr Val
35 40 45



Ala Thr Ser Arg Thr Leu Ser Tyr Tyr Lys Leu Gly Ala Ser Gln Arg
50 55 60



Val Gly Thr Asp Ser Gly Phe Ala Ala Tyr Asn Arg Tyr Arg Ile Gly
65 70 75 80



Asn Tyr Lys Leu Asn Thr Asp His Ala Gly Ser Asn Asp Asn Ile Ala

85 90 95



Leu Leu Val Gln

100



<210> 188

<211> 23

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 188

Phe Tyr Leu Cys Phe Leu Ala Phe Leu Leu Phe Leu Val Leu Ile Met
1 5 10 15



Leu Ile Ile Phe Trp Phe Ser

20 



<210> 189

<211> 19

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 189

Leu Leu Ile Val Leu Thr Cys Ile Ser Leu Cys Ser Cys Ile Cys Thr
1 5 10 15



Val Val Gln



<210> 190
<211> 24
<212> PRT

<213> Severe acute respiratory syndrome virus
<400> 190

Ile Cys Thr Val Val Gln Arg Cys Ala Ser Asn Lys Pro His Val Leu
1 5 10 15



Glu Asp Pro Cys Lys Val Gln His

20



<210> 191

<211> 22

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 191

Cys Ile Cys Thr Val Val Gln Arg Cys Ala Ser Asn Lys Pro His Val

1 5 10 15



Leu Glu Asp Pro Cys Lys

20 



<210> 192 

<211> 22
<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 192

Val Val Ala Val Ile Gln Glu Ile Gln Leu Leu Ala Ala Val Gly Glu
1 5 10 15



Ile Leu Leu Leu Glu Trp

20 



<210> 193

<211> 19

<212> DNA

<213> Artificial Sequence

<220>

<223> Linker

<400> 193

aattcgcggc cgcgtcgac 19



<210> 194

<211> 15 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Linker 

<400> 194 

gtcgacgcgg ccgcg 15 



<210> 195 

<211> 19 

<212> DNA 

<213> Artificial Sequence 



<220>

<223> Primer

<400> 195

aattcgcggc cgcgtcgac 19



<210> 196
<211> 19
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 196

ggcctcttcg ctattacgc 19



<210> 197 

<211> 21 

<212> DNA 

<213> Artificial Sequence 

<220>

<223> Primer

<400> 197

tgcaggtcga ctctagagga t 21



<210> 198
<211> 410
<212> PRT
<213> Avian infectious bronchitis virus

<400> 198

Met Ala Ser Gly Lys Ala Ala Gly Lys Thr Asp Ala Pro Ala Pro Val
1 5 10 15

Ile Lys Leu Gly Gly Pro Lys Pro Pro Lys Val Gly Ser Ser Gly Asn
20 25 30

Ala Ser Trp Phe Gln Ala Ile Lys Ala Lys Lys Leu Asn Thr Pro Pro
35 40 45

Pro Lys Phe Glu Gly Ser Gly Val Pro Asp Asn Glu Asn Ile Lys Pro
50 55 60

Ser Gln Gln His Gly Tyr Trp Arg Arg Gln Ala Arg Phe Lys Pro Gly
65 70 75 80

Lys Gly Gly Arg Lys Pro Val Pro Asp Ala Trp Tyr Phe Tyr Tyr Thr
85 90 95

Gly Thr Gly Pro Ala Ala Asp Leu Asn Trp Gly Asp Thr Gln Asp Gly
100 105 110

Ile Val Trp Val Ala Ala Lys Gly Ala Asp Thr Lys Ser Arg Ser Asn
115 120 125

Gln Gly Thr Arg Asp Pro Asp Lys Phe Asp Gln Tyr Pro Leu Arg Phe
130 135 140

Ser Asp Gly Gly Pro Asp Gly Asn Phe Arg Trp Asp Phe Ile Pro Leu
145 150 155 160

Lys Asn Arg Gly Arg Ser Gly Arg Ser Thr Ala Ala Ser Ser Ala Ala
165 170 175

Ala Ser Arg Ala Pro Ser Arg Glu Gly Ser Arg Gly Arg Arg Ser Asp
180 185 190

Ser Gly Asp Asp Leu Ile Ala Arg Ala Ala Lys Ile Ile Gln Asp Gln
195 200 205

Gln Lys Lys Gly Ser Arg Ile Thr Lys Ala Lys Ala Asp Glu Met Ala
210 215 220

His Arg Arg Tyr Cys Lys Arg Thr Ile Pro Pro Asn Tyr Arg Val Asp
225 230 235 240

Gln Val Phe Gly Pro Arg Thr Lys Gly Lys Glu Gly Asn Phe Gly Asp
245 250 255

Asp Lys Met Asn Glu Glu Gly Ile Lys Asp Gly Arg Val Thr Ala Met
260 265 270

Leu Asn Leu Val Pro Ser Ser His Ala Cys Leu Phe Gly Ser Arg Val
275 280 285

Thr Pro Lys Leu Gln Leu Asp Gly Leu His Leu Arg Phe Glu Phe Thr
290 295 300

Thr Val Val Pro Cys Asp Asp Pro Gln Phe Asp Asn Tyr Val Lys Ile
305 310 315 320

Cys Asp Gln Cys Val Asp Gly Val Gly Thr Arg Pro Lys Asp Asp Glu
325 330 335

Pro Lys Pro Lys Ser Arg Ser Ser Ser Arg Pro Ala Thr Arg Gly Asn
340 345 350

Ser Pro Ala Pro Arg Gln Gln Arg Pro Lys Lys Glu Lys Lys Leu Lys
355 360 365

Lys Gln Asp Asp Glu Ala Asp Lys Ala Leu Thr Ser Asp Glu Glu Arg
370 375 380

Asn Asn Ala Gln Leu Glu Phe Tyr Asp Glu Pro Lys Val Ile Asn Trp
385 390 395 400

Gly Asp Ala Ala Leu Gly Glu Asn Glu Leu
405 410



<210> 199
<211> 30
<212> PRT
<213> conotoxin

<400> 199

Cys Ile Ala Val Gly Gln Leu Cys Val Phe Trp Asn Ile Gly Arg Pro
1 5 10 15

Cys Cys Ser Gly Leu Cys Val Phe Ala Cys Thr Val Lys Leu
20 25 30



<210> 200 

<211> 31 

<212> PRT 

<213> Severe acute respiratory syndrome virus

<400> 200

Cys Ile Ser Leu Cys Ser Cys Ile Cys Thr Val Val Gln Arg Cys Ala
1 5 10 15

Ser Asn Lys Pro His Val Leu Glu Asp Pro Cys Lys Val Gln His
20 25 30



<210> 201 
<211> 310 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 201 

cgatgtttca tcttgttgac ttccaggtta caatagcaga gatattgatt atcattatga 60 
ggactttcag gattgctatt tggaatcttg acgttataat aagttcaata gtgagacaat 120 
tatttaagcc tctaactaag aagaattatt cggagttaga tgatgaagaa cctatggagt 180 
tagattatcc ataaaacgaa catgaaaatt attctcttcc tgacattgat tgtatttaca 240
tcttgcgagc tatatcacta tcaggagtgt gttagaggta cgactgtact actaaaagaa 300
ccttgcccat 310



<210> 202
<211> 556
<212> DNA
<213> Severe acute respiratory syndrome virus

<400> 202

agaaagacag aatgaatgag ctcactttaa ttgacttcta tttgtgcttt ttagcctttc 60
tgctattcct tgttttaata atgcttatta tattttggtt ttcactcgaa atccaggatc 120
tagaagaacc ttgtaccaaa gtctaaacga acatgaaact tctcattgtt ttgacttgta 180
tttctctatg cagttgcata tgcactgtag tacagcgctg tgcatctaat aaacctcatg 240
tgcttgaaga tccttgtaag gtacaacact aggggtaata cttatagcac tgcttggctt 300
tgtgctctag gaaaggtttt accttttcat agatggcaca ctatggttca aacatgcaca 360
cctaatgtta ctatcaactg tcaagatcca gctggtggtg cgcttatagc taggtgttgg 420
taccttcatg aaggtcacca aactgctgca tttagagacg tacttgttgt tttaaataaa 480
cgaacaaatt aaaatgtctg ataatggacc ccaatcaaac caacgtagtg ccccccgcat 540
tacatttggt ggaccc 556



<210> 203
<211> 1255
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 203

Met Phe Ile Phe Leu Leu Phe Leu Thr Leu Thr Ser Gly Ser Asp Leu
1 5 10 15

Asp Arg Cys Thr Thr Phe Asp Asp Val Gln Ala Pro Asn Tyr Thr Gln
20 25 30

His Thr Ser Ser Met Arg Gly Val Tyr Tyr Pro Asp Glu Ile Phe Arg
35 40 45

Ser Asp Thr Leu Tyr Leu Thr Gln Asp Leu Phe Leu Pro Phe Tyr Ser
50 55 60

Asn Val Thr Gly Phe His Thr Ile Asn His Thr Phe Gly Asn Pro Val
65 70 75 80

Ile Pro Phe Lys Asp Gly Ile Tyr Phe Ala Ala Thr Glu Lys Ser Asn
85 90 95

Val Val Arg Gly Trp Val Phe Gly Ser Thr Met Asn Asn Lys Ser Gln
100 105 110

Ser Val Ile Ile Ile Asn Asn Ser Thr Asn Val Val Ile Arg Ala Cys
115 120 125

Asn Phe Glu Leu Cys Asp Asn Pro Phe Phe Ala Val Ser Lys Pro Met
130 135 140

Gly Thr Gln Thr His Thr Met Ile Phe Asp Asn Ala Phe Asn Cys Thr
145 150 155 160

Phe Glu Tyr Ile Ser Asp Ala Phe Ser Leu Asp Val Ser Glu Lys Ser
165 170 175

Gly Asn Phe Lys His Leu Arg Glu Phe Val Phe Lys Asn Lys Asp Gly
180 185 190

Phe Leu Tyr Val Tyr Lys Gly Tyr Gln Pro Ile Asp Val Val Arg Asp
195 200 205

Leu Pro Ser Gly Phe Asn Thr Leu Lys Pro Ile Phe Lys Leu Pro Leu
210 215 220

Gly Ile Asn Ile Thr Asn Phe Arg Ala Ile Leu Thr Ala Phe Ser Pro
225 230 235 240

Ala Gln Asp Ile Trp Gly Thr Ser Ala Ala Ala Tyr Phe Val Gly Tyr
245 250 255

Leu Lys Pro Thr Thr Phe Met Leu Lys Tyr Asp Glu Asn Gly Thr Ile
260 265 270

Thr Asp Ala Val Asp Cys Ser Gln Asn Pro Leu Ala Glu Leu Lys Cys
275 280 285

Ser Val Lys Ser Phe Glu Ile Asp Lys Gly Ile Tyr Gln Thr Ser Asn
290 295 300

Phe Arg Val Val Pro Ser Gly Asp Val Val Arg Phe Pro Asn Ile Thr
305 310 315 320

Asn Leu Cys Pro Phe Gly Glu Val Phe Asn Ala Thr Lys Phe Pro Ser
325 330 335

Val Tyr Ala Trp Glu Arg Lys Lys Ile Ser Asn Cys Val Ala Asp Tyr
340 345 350

Ser Val Leu Tyr Asn Ser Thr Phe Phe Ser Thr Phe Lys Cys Tyr Gly
355 360 365

Val Ser Ala Thr Lys Leu Asn Asp Leu Cys Phe Ser Asn Val Tyr Ala
370 375 380

Asp Ser Phe Val Val Lys Gly Asp Asp Val Arg Gln Ile Ala Pro Gly
385 390 395 400

Gln Thr Gly Val Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe
405 410 415

Met Gly Cys Val Leu Ala Trp Asn Thr Arg Asn Ile Asp Ala Thr Ser
420 425 430

Thr Gly Asn Tyr Asn Tyr Lys Tyr Arg Tyr Leu Arg His Gly Lys Leu
435 440 445

Arg Pro Phe Glu Arg Asp Ile Ser Asn Val Pro Phe Ser Pro Asp Gly
450 455 460

Lys Pro Cys Thr Pro Pro Ala Leu Asn Cys Tyr Trp Pro Leu Asn Asp
465 470 475 480

Tyr Gly Phe Tyr Thr Thr Thr Gly Ile Gly Tyr Gln Pro Tyr Arg Val
485 490 495

Val Val Leu Ser Phe Glu Leu Leu Asn Ala Pro Ala Thr Val Cys Gly
500 505 510

Pro Lys Leu Ser Thr Asp Leu Ile Lys Asn Gln Cys Val Asn Phe Asn
515 520 525

Phe Asn Gly Leu Thr Gly Thr Gly Val Leu Thr Pro Ser Ser Lys Arg
530 535 540

Phe Gln Pro Phe Gln Gln Phe Gly Arg Asp Val Ser Asp Phe Thr Asp
545 550 555 560

Ser Val Arg Asp Pro Lys Thr Ser Glu Ile Leu Asp Ile Ser Pro Cys
565 570 575

Ala Phe Gly Gly Val Ser Val Ile Thr Pro Gly Thr Asn Ala Ser Ser
580 585 590

Glu Val Ala Val Leu Tyr Gln Asp Val Asn Cys Thr Asp Val Ser Thr
595 600 605

Ala Ile His Ala Asp Gln Leu Thr Pro Ala Trp Arg Ile Tyr Ser Thr
610 615 620

Gly Asn Asn Val Phe Gln Thr Gln Ala Gly Cys Leu Ile Gly Ala Glu
625 630 635 640

His Val Asp Thr Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile
645 650 655

Cys Ala Ser Tyr His Thr Val Ser Leu Leu Arg Ser Thr Ser Gln Lys
660 665 670

Ser Ile Val Ala Tyr Thr Met Ser Leu Gly Ala Asp Ser Ser Ile Ala
675 680 685

Tyr Ser Asn Asn Thr Ile Ala Ile Pro Thr Asn Phe Ser Ile Ser Ile
690 695 700

Thr Thr Glu Val Met Pro Val Ser Met Ala Lys Thr Ser Val Asp Cys
705 710 715 720

Asn Met Tyr Ile Cys Gly Asp Ser Thr Glu Cys Ala Asn Leu Leu Leu
725 730 735

Gln Tyr Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Ser Gly Ile
740 745 750

Ala Ala Glu Gln Asp Arg Asn Thr Arg Glu Val Phe Ala Gln Val Lys
755 760 765

Gln Met Tyr Lys Thr Pro Thr Leu Lys Tyr Phe Gly Gly Phe Asn Phe
770 775 780

Ser Gln Ile Leu Pro Asp Pro Leu Lys Pro Thr Lys Arg Ser Phe Ile
785 790 795 800

Glu Asp Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly Phe Met
805 810 815

Lys Gln Tyr Gly Glu Cys Leu Gly Asp Ile Asn Ala Arg Asp Leu Ile
820 825 830

Cys Ala Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu Leu Thr
835 840 845

Asp Asp Met Ile Ala Ala Tyr Thr Ala Ala Leu Val Ser Gly Thr Ala
850 855 860

Thr Ala Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile Pro Phe
865 870 875 880

Ala Met Gln Met Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr Gln Asn
885 890 895

Val Leu Tyr Glu Asn Gln Lys Gln Ile Ala Asn Gln Phe Asn Lys Ala
900 905 910

Ile Ser Gln Ile Gln Glu Ser Leu Thr Thr Thr Ser Thr Ala Leu Gly
915 920 925

Lys Leu Gln Asp Val Val Asn Gln Asn Ala Gln Ala Leu Asn Thr Leu
930 935 940

Val Lys Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val Leu Asn
945 950 955 960

Asp Ile Leu Ser Arg Leu Asp Lys Val Glu Ala Glu Val Gln Ile Asp
965 970 975

Arg Leu Ile Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val Thr Gln
980 985 990

Gln Leu Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn Leu Ala Ala
995 1000 1005

Thr Lys Met Ser Glu Cys Val Leu Gly Gln Ser Lys Arg Val Asp
1010 1015 1020

Phe Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro Gln Ala Ala
1025 1030 1035

Pro His Gly Val Val Phe Leu His Val Thr Tyr Val Pro Ser Gln
1040 1045 1050

Glu Arg Asn Phe Thr Thr Ala Pro Ala Ile Cys His Glu Gly Lys
1055 1060 1065

Ala Tyr Phe Pro Arg Glu Gly Val Phe Val Phe Asn Gly Thr Ser
1070 1075 1080

Trp Phe Ile Thr Gln Arg Asn Phe Phe Ser Pro Gln Ile Ile Thr
1085 1090 1095

Thr Asp Asn Thr Phe Val Ser Gly Asn Cys Asp Val Val Ile Gly
1100 1105 1110

Ile Ile Asn Asn Thr Val Tyr Asp Pro Leu Gln Pro Glu Leu Asp
1115 1120 1125

Ser Phe Lys Glu Glu Leu Asp Lys Tyr Phe Lys Asn His Thr Ser
1130 1135 1140

Pro Asp Val Asp Leu Gly Asp Ile Ser Gly Ile Asn Ala Ser Val
1145 1150 1155

Val Asn Ile Gln Lys Glu Ile Asp Arg Leu Asn Glu Val Ala Lys
1160 1165 1170

Asn Leu Asn Glu Ser Leu Ile Asp Leu Gln Glu Leu Gly Lys Tyr
1175 1180 1185

Glu Gln Tyr Ile Lys Trp Pro Trp Tyr Val Trp Leu Gly Phe Ile
1190 1195 1200

Ala Gly Leu Ile Ala Ile Val Met Val Thr Ile Leu Leu Cys Cys
1205 1210 1215

Met Thr Ser Cys Cys Ser Cys Leu Lys Gly Ala Cys Ser Cys Gly
1220 1225 1230

Ser Cys Cys Lys Phe Asp Glu Asp Asp Ser Glu Pro Val Leu Lys
1235 1240 1245

Gly Val Lys Leu His Tyr Thr
1250 1255



<210> 204 

<211> 422 

<212> PRT 

<213> Severe acute respiratory syndrome virus 



<400> 204

Met Ser Asp Asn Gly Pro Gln Ser Asn Gln Arg Ser Ala Pro Arg Ile
1 5 10 15

Thr Phe Gly Gly Pro Thr Asp Ser Thr Asp Asn Asn Gln Asn Gly Gly
20 25 30

Arg Asn Gly Ala Arg Pro Lys Gln Arg Arg Pro Gln Gly Leu Pro Asn
35 40 45

Asn Thr Ala Ser Trp Phe Thr Ala Leu Thr Gln His Gly Lys Glu Glu
50 55 60

Leu Arg Phe Pro Arg Gly Gln Gly Val Pro Ile Asn Thr Asn Ser Gly
65 70 75 80

Pro Asp Asp Gln Ile Gly Tyr Tyr Arg Arg Ala Thr Arg Arg Val Arg
85 90 95

Gly Gly Asp Gly Lys Met Lys Glu Leu Ser Pro Arg Trp Tyr Phe Tyr
100 105 110

Tyr Leu Gly Thr Gly Pro Glu Ala Ser Leu Pro Tyr Gly Ala Asn Lys
115 120 125

Glu Gly Ile Val Trp Val Ala Thr Glu Gly Ala Leu Asn Thr Pro Lys
130 135 140

Asp His Ile Gly Thr Arg Asn Pro Asn Asn Asn Ala Ala Thr Val Leu
145 150 155 160

Gln Leu Pro Gln Gly Thr Thr Leu Pro Lys Gly Phe Tyr Ala Glu Gly
165 170 175

Ser Arg Gly Gly Ser Gln Ala Ser Ser Arg Ser Ser Ser Arg Ser Arg
180 185 190

Gly Asn Ser Arg Asn Ser Thr Pro Gly Ser Ser Arg Gly Asn Ser Pro
195 200 205

Ala Arg Met Ala Ser Gly Gly Gly Glu Thr Ala Leu Ala Leu Leu Leu
210 215 220

Leu Asp Arg Leu Asn Gln Leu Glu Ser Lys Val Ser Gly Lys Gly Gln
225 230 235 240

Gln Gln Gln Gly Gln Thr Val Thr Lys Lys Ser Ala Ala Glu Ala Ser
245 250 255

Lys Lys Pro Arg Gln Lys Arg Thr Ala Thr Lys Gln Tyr Asn Val Thr
260 265 270

Gln Ala Phe Gly Arg Arg Gly Pro Glu Gln Thr Gln Gly Asn Phe Gly
275 280 285

Asp Gln Asp Leu Ile Arg Gln Gly Thr Asp Tyr Lys His Trp Pro Gln
290 295 300

Ile Ala Gln Phe Ala Pro Ser Ala Ser Ala Phe Phe Gly Met Ser Arg
305 310 315 320

Ile Gly Met Glu Val Thr Pro Ser Gly Thr Trp Leu Thr Tyr His Gly
325 330 335

Ala Ile Lys Leu Asp Asp Lys Asp Pro Gln Phe Lys Asp Asn Val Ile
340 345 350

Leu Leu Asn Lys His Ile Asp Ala Tyr Lys Thr Phe Pro Pro Thr Glu
355 360 365

Pro Lys Lys Asp Lys Lys Lys Lys Thr Asp Glu Ala Gln Pro Leu Pro
370 375 380

Gln Arg Gln Lys Lys Gln Pro Thr Val Thr Leu Leu Pro Ala Ala Asp
385 390 395 400

Met Asp Asp Phe Ser Arg Gln Leu Gln Asn Ser Met Ser Gly Ala Ser
405 410 415

Ala Asp Ser Thr Gln Ala
420



<210> 205 

<211> 221 

<212> PRT 

<213> Sars associated coronavirus 

<400> 205 

Met Ala Asp Asn Gly Thr Ile Thr Val Glu Glu Leu Lys Gln Leu Leu
1 5 10 15

Glu Gln Trp Asn Leu Val Ile Gly Phe Leu Phe Leu Ala Trp Ile Met
20 25 30

Leu Leu Gln Phe Ala Tyr Ser Asn Arg Asn Arg Phe Leu Tyr Ile Ile
35 40 45

Lys Leu Val Phe Leu Trp Leu Leu Trp Pro Val Thr Leu Ala Cys Phe
50 55 60

Val Leu Ala Ala Val Tyr Arg Ile Asn Trp Val Thr Gly Gly Ile Ala
65 70 75 80

Ile Ala Met Ala Cys Ile Val Gly Leu Met Trp Leu Ser Tyr Phe Val
85 90 95

Ala Ser Phe Arg Leu Phe Ala Arg Thr Arg Ser Met Trp Ser Phe Asn
100 105 110

Pro Glu Thr Asn Ile Leu Leu Asn Val Pro Leu Arg Gly Thr Ile Val
115 120 125

Thr Arg Pro Leu Met Glu Ser Glu Leu Val Ile Gly Ala Val Ile Ile
130 135 140

Arg Gly His Leu Arg Met Ala Gly His Ser Leu Gly Arg Cys Asp Ile
145 150 155 160

Lys Asp Leu Pro Lys Glu Ile Thr Val Ala Thr Ser Arg Thr Leu Ser
165 170 175

Tyr Tyr Lys Leu Gly Ala Ser Gln Arg Val Gly Thr Asp Ser Gly Phe
180 185 190

Ala Ala Tyr Asn Arg Tyr Arg Ile Gly Asn Tyr Lys Leu Asn Thr Asp
195 200 205

His Ala Gly Ser Asn Asp Asn Ile Ala Leu Leu Val Gln
210 215 220



<210> 206 
<211> 76 
<212> PRT 

<213> Severe acute respiratory syndrome virus 
<400> 206 

Met Tyr Ser Phe Val Ser Glu Glu Thr Gly Thr Leu Ile Val Asn Ser
1 5 10 15

Val Leu Leu Phe Leu Ala Phe Val Val Phe Leu Leu Val Thr Leu Ala
20 25 30

Ile Leu Thr Ala Leu Arg Leu Cys Ala Tyr Cys Cys Asn Ile Val Asn
35 40 45

Val Ser Leu Val Lys Pro Thr Val Tyr Val Tyr Ser Arg Val Lys Asn
50 55 60

Leu Asn Ser Ser Glu Gly Val Pro Asp Leu Leu Val
65 70 75



Box II Observations where certain claims were found unsearchable (Continuation of item 2 of first sheet)



This International Search Report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:

1. [X] Claims Nos.: 65-66

because they relate to subject matter not required to be searched by this Authority, namely:

Although claims 56-61 are directed to a method of treatment of the 
human/animal body, the search has been carried out and based on the alleged 
effects of the compound/composition. 

2. [X] Claims Nos.: 3 in part

because they relate to parts of the International Application that do not comply with the prescribed requirements to such
an extent that no meaningful International Search can be carried out, specifically:

see FURTHER INFORMATION sheet PCT/ISA/210 



3. [ ] Claims Nos.:

because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

Box III Observations where unity of invention is lacking (Continuation of item 3 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows:

see additional sheet 



1. [ ] As all required additional search fees were timely paid by the applicant, this International Search Report covers all
searchable claims.

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment
of any additional fee.

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this International Search Report
covers only those claims for which fees were paid, specifically claims Nos.:

4. [X] No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.:

1-5; 6-64 (in part)

Remark on Protest
[ ] The additional search fees were accompanied by the applicant's protest.
[ ] No protest accompanied the payment of additional search fees.

Continuation of Box II.1

Although claims 56-61 are directed to a method of treatment of the
human/animal body, the search has been carried out and based on the
alleged effects of the compound/composition.

Continuation of Box II.1
Claims Nos.: 65-66 



The subject-matter of claims 65 and 66 relates only to the presentation
of structural information and is not regarded as a patentable invention
within the meaning of Rule 39.1(v) PCT. This information is disclosed as
nucleic acid / amino acid sequences and stored in the form of computer
readable records.



Continuation of Box II.2
Claims Nos.: 3 in part 



The Sequence Listing as originally filed does not comprise Seq. Id. No.
designators 208, 209. Reference to these Seq. Id. Nos. is unclear.

The applicant's attention is drawn to the fact that claims relating to
inventions in respect of which no international search report has been
established need not be the subject of an international preliminary
examination (Rule 66.1(e) PCT). The applicant is advised that the EPO
policy when acting as an International Preliminary Examining Authority is
normally not to carry out a preliminary examination on matter which has
not been searched. This is the case irrespective of whether or not the
claims are amended following receipt of the search report or during any 
Chapter II procedure. If the application proceeds into the regional phase 
before the EPO, the applicant is reminded that a search may be carried 
out during examination before the EPO (see EPO Guideline C-VI, 8.5), 
should the problems which led to the Article 17(2) declaration be 
overcome. 



